The First Workshop on Natural Language Processing for Programming

Program

Update Aug 18: Thank you to everyone who attended the NLP4Prog workshop! Our video recordings are available here.

Eastern Time (UTC-4)	Title
9:00-9:05	Opening Remarks
9:05-9:35	Invited Talk: Eran Yahav – Pair Programming with Structural Language Models
9:35-10:05	Invited Talk: Charles Sutton – Learning, Search, and Program Synthesis
10:05-10:30	Contributed Talk: Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors / Junayed Mahmud, Fahim Faisal, Raihan Islam Arnob, Antonios Anastasopoulos and Kevin Moran; Contributed Talk: ConTest: A Unit Test Completion Benchmark featuring Context / Johannes Villmow, Jonas Depoix and Adrian Ulges
10:30-11:30	First Poster and Spotlight Talk Session
- [9] Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors
- [17] ConTest: A Unit Test Completion Benchmark featuring Context
- [2] CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model
- [19] Time-Efficient Code Completion Model for the R Programming Language
- [6] Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data
- [26] DIRECT : A Transformer-based Model for Decompiled Identifier Renaming
- [10] Text2App: A Framework for Creating Android Apps from Text Descriptions
- [16] AUTOSQL: Question Auto-Completion for Text-to-SQL
- [18] Don’t code, schedule! Annotating Instructional Executable Text at Scale Without Teaching Annotators How to Code
- [20] Backdoors in Neural Models of Source Code
- [21] Draw Me a Flower: Grounding Abstract Structures in Executable Instructions
- [11] CoDesc: A Large Code–Description Parallel Dataset
- [3] IdBench: Evaluating Semantic Representations of Identifier Names in Source Code
- [3506] Logic-Consistency Text Generation from Semantic Parses
- [873] Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
11:30-12:00	Invited Talk: Mirella Lapata – The Democratization of Semantic Parsing via Zero-Shot Cross-lingual Learning
12:00-12:30	Invited Talk: Julia Hockenmaier – Collaborative Construction and Communication with Minecraft
12:30-13:30	Break
13:30-14:00	Invited Talk: Percy Liang – Learning to Fix Programs
14:00-14:30	Invited Talk: Stefanie Tellex – Towards Complex Language in Partially Observed Environments
14:30-15:15	Open Discussion
15:15-15:45	Invited Talk: Lin Tan – Software Text Analytics for Finding and Fixing Software Bugs
15:45-16:15	Invited Talk: Brad Myers – Programming by Natural Language and Demonstration
16:15-16:40	Contributed Talk: In-IDE Code Generation from Natural Language: Promise and Challenges / Frank F. Xu, Bogdan Vasilescu and Graham Neubig; Contributed Talk: Leveraging Language to Learn Program Abstractions and Search Heuristics / Catherine Wong, Kevin Ellis, Jacob Andreas and Joshua Tenenbaum
16:40-17:35	Second Poster and Spotlight Talk Session
- [31] In-IDE Code Generation from Natural Language: Promise and Challenges
- [4] Leveraging Language to Learn Program Abstractions and Search Heuristics
- [24] CoTexT: Multi-task Learning with Code-Text Transformer
- [12] Shellcode_IA32: A Dataset for Automatic Shellcode Generation
- [27] Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation
- [28] Bag-of-Words Baselines for Semantic Code Search
- [5] ArcaneQA: Strongly Generalizable Question Answering on Large-Scale Knowledge Bases
- [13] Energy-Based Models for Code Generation under Compilability Constraints
- [15] SpecNFS: A Challenge Dataset Towards Extracting Formal Models from Natural Language Specifications
- [25] Communicating Natural Programs to Humans and Machines
- [29] Community-Driven Alignment of Natural and Programming Languages
- [23] Unified Pre-training for Program Understanding and Generation
- [3468] Analysis of Tree-Structured Architectures for Code Generation
- [3552] Disentangled Code Representation Learning for Multiple Programming Languages
17:35-17:45	Concluding Remarks

Accepted Papers

Check out the workshop proceedings here. Congratulations to all the authors, and thanks to all reviewers for their hard work!

Regular Papers

CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model [paper]
Author(s): Tae Hwan Jung
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data [paper]
Author(s): Moshe Hazoom, Vibhor Malik and Ben Bogin
Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors [paper]
Author(s): Junayed Mahmud, Fahim Faisal, Raihan Islam Arnob, Antonios Anastasopoulos and Kevin Moran
Shellcode_IA32: A Dataset for Automatic Shellcode Generation [paper]
Author(s): Pietro Liguori, Erfan Al-Hossami, Domenico Cotroneo, Roberto Natella, Bojan Cukic and Samira Shaikh
ConTest: A Unit Test Completion Benchmark featuring Context [paper]
Author(s): Johannes Villmow, Jonas Depoix and Adrian Ulges
Time-Efficient Code Completion Model for the R Programming Language [paper]
Author(s): Artem Popov, Dmitrii Orekhov, Denis Litvinov, Nikolay Korolev and Gleb Morgachev
CoTexT: Multi-task Learning with Code-Text Transformer [paper]
Author(s): Long Phan, Hieu Tran, Daniel Le, Hieu Nguyen, James Annibal, Alec Peltekian and Yanfang Ye
DIRECT : A Transformer-based Model for Decompiled Identifier Renaming [paper]
Author(s): Vikram Nitin, Anthony Saieva, Baishakhi Ray and Gail Kaiser
Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation [paper]
Author(s): Gabriel Orlanski and Alex Gittens
Bag-of-Words Baselines for Semantic Code Search [paper]
Author(s): Xinyu Zhang, Ji Xin, Andrew Yates and Jimmy Lin

Non-Archival Papers

IdBench: Evaluating Semantic Representations of Identifier Names in Source Code
Author(s): Yaza Wainakh, Moiz Rauf and Michael Pradel
Leveraging Language to Learn Program Abstractions and Search Heuristics
Author(s): Catherine Wong, Kevin Ellis, Jacob Andreas and Joshua Tenenbaum
ArcaneQA: Strongly Generalizable Question Answering on Large-Scale Knowledge Bases
Author(s): Yu Gu and Yu Su
Text2App: A Framework for Creating Android Apps from Text Descriptions
Author(s): Masum Hasan, Kazi Sajeed Mehrab, Wasi Ahmad and Rifat Shahriyar
CoDesc: A Large Code–Description Parallel Dataset
Author(s): Masum Hasan, Tanveer Muttaqueen, Abdullah Al Ishtiaq, Kazi Sajeed Mehrab, Md. Mahim Anjum Haque, Tahmid Hasan, Wasi Ahmad, Anindya Iqbal and Rifat Shahriyar
Energy-Based Models for Code Generation under Compilability Constraints
Author(s): Tomasz Korbak, Hady Elsahar, Germán Kruszewski and Marc Dymetman
SpecNFS: A Challenge Dataset Towards Extracting Formal Models from Natural Language Specifications
Author(s): Sayontan Ghosh, Amanpreet Singh, Alex Merenstein, Wei Su, Scott Smolka, Erez Zadok and Niranjan Balasubramanian
AUTOSQL: Question Auto-Completion for Text-to-SQL
Author(s): Naihao Deng, Shuaichen Chang, Peng Shi, Tao Yu and Rui Zhang
Don’t code, schedule! Annotating Instructional Executable Text at Scale Without Teaching Annotators How to Code
Backdoors in Neural Models of Source Code
Draw Me a Flower: Grounding Abstract Structures in Executable Instructions
Author(s): Royi Lachmy, Valentina Pyatkin and Reut Tsarfaty
Unified Pre-training for Program Understanding and Generation
Author(s): Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray and Kai-Wei Chang
Communicating Natural Programs to Humans and Machines
Community-Driven Alignment of Natural and Programming Languages
In-IDE Code Generation from Natural Language: Promise and Challenges
Author(s): Frank F. Xu, Bogdan Vasilescu and Graham Neubig

Presentations from Findings of ACL

Analysis of Tree-Structured Architectures for Code Generation
Author(s): Samip Dahal, Adyasha Maharana and Mohit Bansal
Disentangled Code Representation Learning for Multiple Programming Languages
Author(s): Jingfeng Zhang, Haiwen Hong, Yin Zhang, Yao Wan, Ye Liu and Yulei Sui
Logic-Consistency Text Generation from Semantic Parses
Author(s): Chang Shu, Yusen Zhang, Xiangyu Dong, Peng Shi, Tao Yu and Rui Zhang
Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
Author(s): Qian Liu, Dejian Yang, Jiahui Zhang, Jiaqi Guo, Bin Zhou and Jian-Guang Lou