The First Workshop on Natural Language Processing for Programming
Program
Update Aug 18: Thank you to everyone who attended the NLP4Prog workshop! Our video recordings are available here.
Eastern Time (UTC-4) | Title |
---|---|
9:00-9:05 | Opening Remarks |
9:05-9:35 | Invited Talk: Eran Yahav – Pair Programming with Structural Language Models |
9:35-10:05 | Invited Talk: Charles Sutton – Learning, Search, and Program Synthesis |
10:05-10:30 | Contributed Talk: Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors / Junayed Mahmud, Fahim Faisal, Raihan Islam Arnob, Antonios Anastasopoulos and Kevin Moran; Contributed Talk: ConTest: A Unit Test Completion Benchmark featuring Context / Johannes Villmow, Jonas Depoix and Adrian Ulges |
10:30-11:30 | First Poster and Spotlight Talk Session - [9] Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors - [17] ConTest: A Unit Test Completion Benchmark featuring Context - [2] CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model - [19] Time-Efficient Code Completion Model for the R Language - [6] Text-to-SQL in the wild: A naturally-occurring dataset based on Stack Exchange Data - [26] DIRECT : A Transformer-based Model for Decompiled Identifier Renaming - [10] Text2App: A Framework for Creating Android Apps from Text Descriptions - [16] AUTOSQL: Question Auto-Completion for Text-to-SQL - [18] Don’t code, schedule! Annotating Instructional Executable Text at Scale Without Teaching Annotators How to Code - [20] Backdoors in Neural Models of Source Code - [21] Draw Me a Flower: Grounding Abstract Structures in Executable Instructions - [11] CoDesc: A Large Code–Description Parallel Dataset - [3] IdBench: Evaluating Semantic Representations of Identifier Names in Source Code - [3506] Logic-Consistency Text Generation from Semantic Parses - [873] Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing |
11:30-12:00 | Invited Talk: Mirella Lapata – The Democratization of Semantic Parsing via Zero-Shot Cross-lingual Learning |
12:00-12:30 | Invited Talk: Julia Hockenmaier – Collaborative Construction and Communication with Minecraft |
12:30-13:30 | Break |
13:30-14:00 | Invited Talk: Percy Liang – Learning to Fix Programs |
14:00-14:30 | Invited Talk: Stefanie Tellex – Towards Complex Language in Partially Observed Environments |
14:30-15:15 | Open Discussion |
15:15-15:45 | Invited Talk: Lin Tan – Software Text Analytics for Finding and Fixing Software Bugs |
15:45-16:15 | Invited Talk: Brad Myers – Programming by Natural Language and Demonstration |
16:15-16:40 | Contributed Talk: In-IDE Code Generation from Natural Language: Promise and Challenges / Frank F. Xu, Bogdan Vasilescu and Graham Neubig; Contributed Talk: Leveraging Language to Learn Program Abstractions and Search Heuristics / Catherine Wong, Kevin Ellis, Jacob Andreas and Joshua Tenenbaum |
16:40-17:35 | Second Poster and Spotlight Talk Session - [31] In-IDE Code Generation from Natural Language: Promise and Challenges - [4] Leveraging Language to Learn Program Abstractions and Search Heuristics - [24] CoTexT: Multi-task Learning with Code-Text Transformer - [12] Shellcode_IA32: A Dataset for Automatic Shellcode Generation - [27] Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation - [28] Bag-of-Words Baselines for Semantic Code Search - [5] ArcaneQA: Strongly Generalizable Question Answering on Large-Scale Knowledge Bases - [13] Energy-Based Models for Code Generation under Compilability Constraints - [15] SpecNFS: A Challenge Dataset Towards Extracting Formal Models from Natural Language Specifications - [25] Communicating Natural Programs to Humans and Machines - [29] Community-Driven Alignment of Natural and Programming Languages - [23] Unified Pre-training for Program Understanding and Generation - [3468] Analysis of Tree-Structured Architectures for Code Generation - [3552] Disentangled Code Representation Learning for Multiple Programming Languages |
17:35-17:45 | Concluding Remarks |
Accepted Papers
Check out the workshop proceedings here. Congratulations to all the authors, and thanks to all reviewers for their hard work!
Regular Papers
- CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model [paper]
Author(s): Tae Hwan Jung
- Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data [paper]
Author(s): Moshe Hazoom, Vibhor Malik and Ben Bogin
- Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors [paper]
Author(s): Junayed Mahmud, Fahim Faisal, Raihan Islam Arnob, Antonios Anastasopoulos and Kevin Moran
- Shellcode_IA32: A Dataset for Automatic Shellcode Generation [paper]
Author(s): Pietro Liguori, Erfan Al-Hossami, Domenico Cotroneo, Roberto Natella, Bojan Cukic and Samira Shaikh
- ConTest: A Unit Test Completion Benchmark featuring Context [paper]
Author(s): Johannes Villmow, Jonas Depoix and Adrian Ulges
- Time-Efficient Code Completion Model for the R Programming Language [paper]
Author(s): Artem Popov, Dmitrii Orekhov, Denis Litvinov, Nikolay Korolev and Gleb Morgachev
- CoTexT: Multi-task Learning with Code-Text Transformer [paper]
Author(s): Long Phan, Hieu Tran, Daniel Le, Hieu Nguyen, James Annibal, Alec Peltekian and Yanfang Ye
- DIRECT : A Transformer-based Model for Decompiled Identifier Renaming [paper]
Author(s): Vikram Nitin, Anthony Saieva, Baishakhi Ray and Gail Kaiser
- Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation [paper]
Author(s): Gabriel Orlanski and Alex Gittens
- Bag-of-Words Baselines for Semantic Code Search [paper]
Author(s): Xinyu Zhang, Ji Xin, Andrew Yates and Jimmy Lin
Non-Archival Papers
- IdBench: Evaluating Semantic Representations of Identifier Names in Source Code
Author(s): Yaza Wainakh, Moiz Rauf and Michael Pradel
- Leveraging Language to Learn Program Abstractions and Search Heuristics
Author(s): Catherine Wong, Kevin Ellis, Jacob Andreas and Joshua Tenenbaum
- ArcaneQA: Strongly Generalizable Question Answering on Large-Scale Knowledge Bases
Author(s): Yu Gu and Yu Su
- Text2App: A Framework for Creating Android Apps from Text Descriptions
Author(s): Masum Hasan, Kazi Sajeed Mehrab, Wasi Ahmad and Rifat Shahriyar
- CoDesc: A Large Code–Description Parallel Dataset
Author(s): Masum Hasan, Tanveer Muttaqueen, Abdullah Al Ishtiaq, Kazi Sajeed Mehrab, Md. Mahim Anjum Haque, Tahmid Hasan, Wasi Ahmad, Anindya Iqbal and Rifat Shahriyar
- Energy-Based Models for Code Generation under Compilability Constraints
Author(s): Tomasz Korbak, Hady Elsahar, Germán Kruszewski and Marc Dymetman
- SpecNFS: A Challenge Dataset Towards Extracting Formal Models from Natural Language Specifications
Author(s): Sayontan Ghosh, Amanpreet Singh, Alex Merenstein, Wei Su, Scott Smolka, Erez Zadok and Niranjan Balasubramanian
- AUTOSQL: Question Auto-Completion for Text-to-SQL
Author(s): Naihao Deng, Shuaichen Chang, Peng Shi, Tao Yu and Rui Zhang
- Don’t code, schedule! Annotating Instructional Executable Text at Scale Without Teaching Annotators How to Code
- Backdoors in Neural Models of Source Code
- Draw Me a Flower: Grounding Abstract Structures in Executable Instructions
Author(s): Royi Lachmy, Valentina Pyatkin and Reut Tsarfaty
- Unified Pre-training for Program Understanding and Generation
Author(s): Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray and Kai-Wei Chang
- Communicating Natural Programs to Humans and Machines
- Community-Driven Alignment of Natural and Programming Languages
- In-IDE Code Generation from Natural Language: Promise and Challenges
Author(s): Frank F. Xu, Bogdan Vasilescu and Graham Neubig
Presentations from Findings of ACL
- Analysis of Tree-Structured Architectures for Code Generation
Author(s): Samip Dahal, Adyasha Maharana and Mohit Bansal
- Disentangled Code Representation Learning for Multiple Programming Languages
Author(s): Jingfeng Zhang, Haiwen Hong, Yin Zhang, Yao Wan, Ye Liu and Yulei Sui
- Logic-Consistency Text Generation from Semantic Parses
Author(s): Chang Shu, Yusen Zhang, Xiangyu Dong, Peng Shi, Tao Yu and Rui Zhang
- Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
Author(s): Qian Liu, Dejian Yang, Jiahui Zhang, Jiaqi Guo, Bin Zhou and Jian-Guang Lou