GitHub Next | Extract, Edit, Apply

One of the core problems in software development is modifying existing codebases, particularly when working with large, complex systems where understanding the code and its specifications can be challenging.

There are multiple problems with natural language "programming" where words, specifications or documentation are primary. These include:

The inherent ambiguity of language.
The absence of specifications or documentation in many existing codebases.
The difficulty of maintaining natural language specifications and documentation under change.
The difficulty for non-native speakers to work with highly technical language.
The non-determinism of LLM output, including generated code, even when intent is unambiguous.
The instability of LLM code-generation under otherwise small or unimportant changes to inputs.

Another human problem arises with natural language programming: each time users create something or describe a task, they must "find the words" - find the vocabulary, the concepts and the precision to describe the change they intend. Often the user has no idea how to do this - words can be hard! A similar problem is the "reference problem": users must find words that refer, with sufficient precision, to exact code locations, functions, methods, classes, feature points, design layers, visual elements and other logical entities. This can be extremely difficult.

Core Idea

What if we start from the opposite position: what if code is primary, and specifications (words) are secondary, while still embracing natural language as a valid way of describing change? This is the starting point of the Extract, Edit, Apply (EEA) concept we have been exploring at GitHub Next.

EEA revolves around the notion of ephemeral, editable, partial specifications. The paradigm is to make code permanent and specifications ephemeral: users can edit either the code or ephemeral specs, which are essentially code summaries that can be generated, modified, and discarded as needed. If the user edits an ephemeral specification, the toolchain will offer a code change corresponding to the spec change. The user can then accept or reject the code change, and the toolchain will apply the change to the codebase.
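
The loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the GitHub Next implementation: the names `extract_edit_apply`, `ProposedChange`, and the `toy_*` helpers are hypothetical, and the steps that would normally call a language model (summarizing code, turning a spec edit into a code edit) are replaced by trivial stand-in functions so the example runs on its own.

```python
from dataclasses import dataclass
from difflib import unified_diff
from typing import Callable


@dataclass
class ProposedChange:
    """A code change offered to the user for review (hypothetical type)."""
    old_code: str
    new_code: str

    def diff(self) -> str:
        # Render the proposal as a unified diff the user can inspect.
        return "".join(unified_diff(
            self.old_code.splitlines(keepends=True),
            self.new_code.splitlines(keepends=True),
            fromfile="before", tofile="after"))


def extract_edit_apply(
    code: str,
    extract: Callable[[str], str],            # code -> ephemeral spec
    edit: Callable[[str], str],               # user edits the spec
    propose: Callable[[str, str, str], str],  # (code, old spec, new spec) -> new code
    accept: Callable[[ProposedChange], bool], # user accepts or rejects
) -> str:
    spec = extract(code)        # Extract: generate an ephemeral summary
    edited_spec = edit(spec)    # Edit: the user changes the summary
    if edited_spec == spec:
        return code             # nothing changed, nothing to apply
    change = ProposedChange(code, propose(code, spec, edited_spec))
    # Apply: the code is updated only if the user accepts the proposal.
    # The spec is discarded either way; the code remains the source of truth.
    return change.new_code if accept(change) else code


# Toy stand-ins (assumptions): a real toolchain would use a model here.
source = "def greet(name):\n    return 'Hello, ' + name\n"

def toy_extract(code: str) -> str:
    return "greet(name) returns 'Hello, ' followed by the name."

def toy_edit(spec: str) -> str:
    return spec.replace("Hello", "Goodbye")

def toy_propose(code: str, old_spec: str, new_spec: str) -> str:
    return code.replace("Hello", "Goodbye")

result = extract_edit_apply(source, toy_extract, toy_edit, toy_propose,
                            accept=lambda change: True)
```

Passing the four steps in as functions keeps the sketch neutral about how summaries and code changes are produced. The only property it encodes is the one stated above: the spec is temporary, and the code changes only when the user accepts a proposal.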

Generality

EEA opens the door to several different approaches to software development, including:

Specification by properties
Specification by example
Specification by contract
Specification by design
Specification by test
Specification by requirement
Specification by formal language

Examples of these are described in the report.

Report

A GitHub Next technical report is available.

What’s next?

The Extract, Edit, Apply (EEA) concept represents a new class of assists that can be used to incorporate natural language summarization and editing even when working with complex artifacts. Our report presents a concept: one we have implemented, used, and liked, and that we believe is promising. EEA is not a replacement for existing approaches, but rather a new way of thinking about how to integrate natural language into the software development process.

We have concluded this investigation and invite you to read our technical report. We will be looking at applying this technique where necessary in our own projects.