The First Workshop on Natural Language Processing for Programming

Program

Update Aug 18: Thanks to everyone for attending the NLP4Prog workshop! Our video recordings are available here.

Eastern Time (UTC-4) Title
9:00-9:05 Opening Remarks
9:05-9:35 Invited Talk: Pair Programming with Structural Language Models / Eran Yahav
9:35-10:05 Invited Talk: Learning, Search, and Program Synthesis / Charles Sutton
10:05-10:30 Contributed Talk: Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors / Junayed Mahmud, Fahim Faisal, Raihan Islam Arnob, Antonios Anastasopoulos and Kevin Moran

Contributed Talk: ConTest: A Unit Test Completion Benchmark featuring Context / Johannes Villmow, Jonas Depoix and Adrian Ulges
10:30-11:30 First Poster and Spotlight Talk Session
- [9] Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors
- [17] ConTest: A Unit Test Completion Benchmark featuring Context
- [2] CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model
- [19] Time-Efficient Code Completion Model for the R Programming Language
- [6] Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data
- [26] DIRECT: A Transformer-based Model for Decompiled Identifier Renaming
- [10] Text2App: A Framework for Creating Android Apps from Text Descriptions
- [16] AUTOSQL: Question Auto-Completion for Text-to-SQL
- [18] Don’t code, schedule! Annotating Instructional Executable Text at Scale Without Teaching Annotators How to Code
- [20] Backdoors in Neural Models of Source Code
- [21] Draw Me a Flower: Grounding Abstract Structures in Executable Instructions
- [11] CoDesc: A Large Code–Description Parallel Dataset
- [3] IdBench: Evaluating Semantic Representations of Identifier Names in Source Code
- [3506] Logic-Consistency Text Generation from Semantic Parses
- [873] Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
11:30-12:00 Invited Talk: The Democratization of Semantic Parsing via Zero-Shot Cross-lingual Learning / Mirella Lapata
12:00-12:30 Invited Talk: Collaborative Construction and Communication with Minecraft / Julia Hockenmaier
12:30-13:30 Break
13:30-14:00 Invited Talk: Learning to Fix Programs / Percy Liang
14:00-14:30 Invited Talk: Towards Complex Language in Partially Observed Environments / Stefanie Tellex
14:30-15:15 Open Discussion
15:15-15:45 Invited Talk: Software Text Analytics for Finding and Fixing Software Bugs / Lin Tan
15:45-16:15 Invited Talk: Programming by Natural Language and Demonstration / Brad Myers
16:15-16:40 Contributed Talk: In-IDE Code Generation from Natural Language: Promise and Challenges / Frank F. Xu, Bogdan Vasilescu and Graham Neubig

Contributed Talk: Leveraging Language to Learn Program Abstractions and Search Heuristics / Catherine Wong, Kevin Ellis, Jacob Andreas and Joshua Tenenbaum
16:40-17:35 Second Poster and Spotlight Talk Session
- [31] In-IDE Code Generation from Natural Language: Promise and Challenges
- [4] Leveraging Language to Learn Program Abstractions and Search Heuristics
- [24] CoTexT: Multi-task Learning with Code-Text Transformer
- [12] Shellcode_IA32: A Dataset for Automatic Shellcode Generation
- [27] Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation
- [28] Bag-of-Words Baselines for Semantic Code Search
- [5] ArcaneQA: Strongly Generalizable Question Answering on Large-Scale Knowledge Bases
- [13] Energy-Based Models for Code Generation under Compilability Constraints
- [15] SpecNFS: A Challenge Dataset Towards Extracting Formal Models from Natural Language Specifications
- [25] Communicating Natural Programs to Humans and Machines
- [29] Community-Driven Alignment of Natural and Programming Languages
- [23] Unified Pre-training for Program Understanding and Generation
- [3468] Analysis of Tree-Structured Architectures for Code Generation
- [3552] Disentangled Code Representation Learning for Multiple Programming Languages
17:35-17:45 Concluding Remarks

Accepted Papers

Check out the workshop proceedings here. Congratulations to all the authors, and thanks to all reviewers for their hard work!

Regular Papers

  • CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model [paper]
    Author(s): Tae Hwan Jung
  • Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data [paper]
    Author(s): Moshe Hazoom, Vibhor Malik and Ben Bogin
  • Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors [paper]
    Author(s): Junayed Mahmud, Fahim Faisal, Raihan Islam Arnob, Antonios Anastasopoulos and Kevin Moran
  • Shellcode_IA32: A Dataset for Automatic Shellcode Generation [paper]
    Author(s): Pietro Liguori, Erfan Al-Hossami, Domenico Cotroneo, Roberto Natella, Bojan Cukic and Samira Shaikh
  • ConTest: A Unit Test Completion Benchmark featuring Context [paper]
    Author(s): Johannes Villmow, Jonas Depoix and Adrian Ulges
  • Time-Efficient Code Completion Model for the R Programming Language [paper]
    Author(s): Artem Popov, Dmitrii Orekhov, Denis Litvinov, Nikolay Korolev and Gleb Morgachev
  • CoTexT: Multi-task Learning with Code-Text Transformer [paper]
    Author(s): Long Phan, Hieu Tran, Daniel Le, Hieu Nguyen, James Annibal, Alec Peltekian and Yanfang Ye
  • DIRECT: A Transformer-based Model for Decompiled Identifier Renaming [paper]
    Author(s): Vikram Nitin, Anthony Saieva, Baishakhi Ray and Gail Kaiser
  • Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation [paper]
    Author(s): Gabriel Orlanski and Alex Gittens
  • Bag-of-Words Baselines for Semantic Code Search [paper]
    Author(s): Xinyu Zhang, Ji Xin, Andrew Yates and Jimmy Lin

Non-Archival Papers

  • IdBench: Evaluating Semantic Representations of Identifier Names in Source Code
    Author(s): Yaza Wainakh, Moiz Rauf and Michael Pradel
  • Leveraging Language to Learn Program Abstractions and Search Heuristics
    Author(s): Catherine Wong, Kevin Ellis, Jacob Andreas and Joshua Tenenbaum
  • ArcaneQA: Strongly Generalizable Question Answering on Large-Scale Knowledge Bases
    Author(s): Yu Gu and Yu Su
  • Text2App: A Framework for Creating Android Apps from Text Descriptions
    Author(s): Masum Hasan, Kazi Sajeed Mehrab, Wasi Ahmad and Rifat Shahriyar
  • CoDesc: A Large Code–Description Parallel Dataset
    Author(s): Masum Hasan, Tanveer Muttaqueen, Abdullah Al Ishtiaq, Kazi Sajeed Mehrab, Md. Mahim Anjum Haque, Tahmid Hasan, Wasi Ahmad, Anindya Iqbal and Rifat Shahriyar
  • Energy-Based Models for Code Generation under Compilability Constraints
    Author(s): Tomasz Korbak, Hady Elsahar, Germán Kruszewski and Marc Dymetman
  • SpecNFS: A Challenge Dataset Towards Extracting Formal Models from Natural Language Specifications
    Author(s): Sayontan Ghosh, Amanpreet Singh, Alex Merenstein, Wei Su, Scott Smolka, Erez Zadok and Niranjan Balasubramanian
  • AUTOSQL: Question Auto-Completion for Text-to-SQL
    Author(s): Naihao Deng, Shuaichen Chang, Peng Shi, Tao Yu and Rui Zhang
  • Don’t code, schedule! Annotating Instructional Executable Text at Scale Without Teaching Annotators How to Code
  • Backdoors in Neural Models of Source Code
  • Draw Me a Flower: Grounding Abstract Structures in Executable Instructions
    Author(s): Royi Lachmy, Valentina Pyatkin and Reut Tsarfaty
  • Unified Pre-training for Program Understanding and Generation
    Author(s): Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray and Kai-Wei Chang
  • Communicating Natural Programs to Humans and Machines
  • Community-Driven Alignment of Natural and Programming Languages
  • In-IDE Code Generation from Natural Language: Promise and Challenges
    Author(s): Frank F. Xu, Bogdan Vasilescu and Graham Neubig

Presentations from Findings of ACL

  • Analysis of Tree-Structured Architectures for Code Generation
    Author(s): Samip Dahal, Adyasha Maharana and Mohit Bansal
  • Disentangled Code Representation Learning for Multiple Programming Languages
    Author(s): Jingfeng Zhang, Haiwen Hong, Yin Zhang, Yao Wan, Ye Liu and Yulei Sui
  • Logic-Consistency Text Generation from Semantic Parses
    Author(s): Chang Shu, Yusen Zhang, Xiangyu Dong, Peng Shi, Tao Yu and Rui Zhang
  • Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
    Author(s): Qian Liu, Dejian Yang, Jiahui Zhang, Jiaqi Guo, Bin Zhou and Jian-Guang Lou