J: A Simple Programming Language

To have a variety of compiler projects for CS 370, sometimes we need to be creative. So I made up my own programming language, which I call J!

One principle of J is that it is keyword-heavy. Lots of things have explicit keywords that introduce them.

NOTE: This page is a work in progress and may contain typos and errors in the grammar and programs.

NOTE: The Fall 2024 CS 370 project, which is the first use of the J language, did not follow this grammar exactly. I have not yet decided which of the two I will officially change to match the other…

The Basic J Language Structure

A J program is constructed of three parts in sequence:

  1. Global variable declarations
  2. Function declarations
  3. The main program

Comments begin with ‘#’ and go to the end of the line. Comments are not handled in the grammar; the scanner should remove them as it does lexical analysis.
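
As a rough illustration of the comment handling described above, here is a minimal Python sketch of a pre-pass that strips comments before tokenizing. The function name strip_comments is hypothetical. The sketch assumes that string constants are double-quoted with backslash escapes (as in the examples on this page) and that a '#' inside a string constant does not start a comment; this page does not actually specify the second point.

```python
def strip_comments(source: str) -> str:
    """Remove '#' comments from J source text, line by line.

    Assumption: a '#' inside a double-quoted string constant is
    ordinary text, not the start of a comment.
    """
    out_lines = []
    for line in source.splitlines():
        in_string = False   # currently inside a "..." constant?
        escaped = False     # previous character was a backslash?
        cut = len(line)     # index where the comment starts, if any
        for i, ch in enumerate(line):
            if escaped:
                escaped = False
            elif ch == "\\" and in_string:
                escaped = True
            elif ch == '"':
                in_string = not in_string
            elif ch == "#" and not in_string:
                cut = i
                break
        # Drop the comment and any whitespace left in front of it.
        out_lines.append(line[:cut].rstrip())
    return "\n".join(out_lines)
```

A real scanner would more likely skip comments as part of its token loop; a separate pass just keeps the idea easy to see.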

Global variables are declared one at a time, beginning with the keyword global and ending with a semicolon. Lists of variables are not allowed.

Functions are declared beginning with the keyword function. All functions implicitly return an integer value. Function parameter declarations are enclosed in parentheses.

The main program begins with the keyword program.

Function bodies and the main program are enclosed in curly braces.

Function bodies and the main program can begin with local variable declarations (similar to globals but with the keyword local), and then a sequence of statements.

A small list of built-in “library” functions is available. These are called like normal functions, using the keyword call. The library functions will vary depending on the target platform. For RISC-V we use the available I/O in the RARS simulator, wrapped in functions (e.g., printStr()). For x86-64 we use the C library (e.g., printf() and others).

Example Programs in J

Two simple programs are below; longer programs are at the bottom of this page.

Hello world:

# Hello World in J
program {
   call printStr("Hello World!\n");
}

A simple value entry:

# Read and print a value
program {
   local int val;
   call readInt(): val;
   call printStr("Entered value is: ");
   call printInt(val);
   call printStr("\n");
}

High Level Grammar

In the grammar below, all-caps indicates a keyword, or a punctuation symbol; in the language, the keywords are all lowercase. Note that this is a real but perhaps abstract grammar; in CS 370 we may, for example, split the rules for different variable declarations into separate non-terminals rather than having them all use vardecl.

wholeprogram :- globals functions program

globals :- /empty/
globals :- global globals
global :- GLOBAL vardecl SEMICOLON

functions :- /empty/
functions :- function functions
function :- FUNCTION IDString LPAREN parameters RPAREN LBRACE locals statements RBRACE
            
program :- PROGRAM LBRACE locals statements RBRACE

parameters :- /empty/
parameters :- parameter
parameters :- parameter COMMA parameters
parameter :- vardecl

locals :- /empty/
locals :- local locals
local :- LOCAL vardecl SEMICOLON

vardecl :- type IDString
vardecl :- type IDString LBRACKET NUMBER RBRACKET
type :- INT
type :- STRING

statements :- /empty/
statements :- statement statements
statement :- assignment | ifthenelse | functioncall | whileloop | forloop | foreachloop | return

assignment :- IDString EQUALS expression SEMICOLON

ifthenelse :- IF LPAREN boolexpr RPAREN THEN LBRACE statements RBRACE ELSE LBRACE statements RBRACE

functioncall :- CALL IDString LPAREN arguments RPAREN SEMICOLON
functioncall :- CALL IDString LPAREN arguments RPAREN COLON IDString SEMICOLON

whileloop :- WHILE LPAREN boolexpr RPAREN DO LBRACE statements RBRACE

forloop :- FOR LPAREN INT IDString FROM expression TO expression RPAREN DO LBRACE statements RBRACE

foreachloop :- FOREACH LPAREN INT IDString IN IDString RPAREN DO LBRACE statements RBRACE

return :- RETURN expression SEMICOLON

arguments :- /empty/
arguments :- argument
arguments :- argument COMMA arguments
argument :- expression

expression :- term
expression :- term ADDOP term
term :- factor
term :- factor MULOP factor
factor :- varref
factor :- NUMBER
factor :- STRING
factor :- LPAREN expression RPAREN

varref :- IDString
varref :- IDString LBRACKET expression RBRACKET

boolexpr :- boolterm
boolexpr :- NOT boolterm
boolterm :- boolterm LOGICALOR boolterm
boolterm :- boolterm LOGICALAND boolterm
boolterm :- boolfactor
boolfactor :- expression RELOP expression
boolfactor :- LPAREN boolexpr RPAREN

Grammar Notes

Boolean (condition) expressions are separated from normal expressions. Normal expressions with operators are integer computations; if a string variable or constant is used with operators, a syntax error should be generated. String constants and variables can only be used without operators (i.e., by themselves).

Operator precedence is built into the expression rules, and is normal: multiplication operators have higher precedence than addition operators, and relational operators have higher precedence than logical operators. Other than that, explicit parentheses must be used to control and form complex expressions. Operators at the same precedence level are evaluated left to right.
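
To make the precedence rules concrete, here is a minimal recursive-descent sketch in Python that mirrors the expression, term, and factor rules and evaluates integer expressions. All names (tokenize, ExprParser, evaluate) are hypothetical. It rests on several assumptions that this page does not state: ADDOP is '+' or '-', MULOP is only '*' (division is left out because its semantics are not defined here), and operators at the same level may be chained left to right, which is slightly looser than the grammar rules as written.

```python
import re

# One token per match: a number, an identifier, or a single symbol.
TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class ExprParser:
    def __init__(self, tokens, variables=None):
        self.tokens = tokens
        self.pos = 0
        self.variables = variables or {}

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expression(self):
        # expression :- term { ADDOP term }   (chaining is an assumption)
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term :- factor { MULOP factor }, binding tighter than ADDOP
        value = self.factor()
        while self.peek() == "*":
            self.take()
            value *= self.factor()
        return value

    def factor(self):
        # factor :- NUMBER | varref | LPAREN expression RPAREN
        tok = self.take()
        if tok == "(":
            value = self.expression()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok.isdigit():
            return int(tok)
        if tok in self.variables:
            return self.variables[tok]
        raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text, variables=None):
    parser = ExprParser(tokenize(text), variables)
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

With this sketch, evaluate("2 + 3 * 4") multiplies first, while evaluate("(2 + 3) * 4") adds first because of the parentheses. A compiler would build a syntax tree at these points rather than compute values; evaluating directly just keeps the example short.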

Expressions in array indexing must be integer expressions. Array indices start at 0.

Behavior Notes

Integers are 32 bits. String and array arguments are passed by reference (address). Integers are passed by value. String references (pointers) are whatever size is needed in the target architecture, typically 32 or 64 bits.
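
The difference between the two passing modes can be sketched in Python, whose own behavior happens to line up here: rebinding an integer parameter does not affect the caller, while writing into a list parameter does. This is only an analogy for the J rules above, not a description of how a J compiler implements them, and the name fill is hypothetical.

```python
def fill(arr, start_val):
    """Fill arr with consecutive values beginning at start_val."""
    for i in range(len(arr)):
        arr[i] = start_val         # visible to the caller: arr is a reference
        start_val = start_val + 1  # changes only the local copy of the integer

vals = [0] * 5
x = 10
fill(vals, x)
print(vals)  # the caller's array was modified
print(x)     # the caller's integer is unchanged
```

This matches the second long example on this page, where arrayFun increments its startVal parameter without affecting the global x passed in.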

Library Functions

As stated above, the library functions will vary depending on the target platform. For RISC-V we use the available I/O in the RARS simulator, wrapped in functions (e.g., printStr()). For x86-64 we use the C library (e.g., printf() and others). The input function scanf() is harder to use in J because we do not have a reference capability for integers.

In RISC-V we used:

  1. printStr(string s) - prints the string to output
  2. printInt(int v) - prints the integer value in decimal
  3. readInt() - reads an integer value from input and returns it

There is more we can do based on RARS system calls.

TODO

  1. Need to define more RISC-V library functions.
  2. Need to define an interface for scanf().

A Bigger Example Program

global int vals[100];

function populate(int seed)
{
   for (int i from 0 to 99) do {
      vals[i] = seed + i;
   }
}

function doSum(int a[100])
{
   local int s;
   s = 0;
   for (int i from 0 to 99) do {
      s = s + a[i];
   }
   return s;
}

program 
{
   local int sum;
   call populate(42);
   call doSum(vals) : sum;
   call printStr("Array sum is: ");
   call printInt(sum);
   call printStr("\n");
}  

And another program

global int x;
global int ynotused;
global int arr[100];

function makePattern(string s1, string s2, int val)
{
   local int y;
   call printStr("Arg val is: ");
   call printInt(val);
   call printStr("\nPrinting a pattern\n");
   while (x != 0) do {
      y = 0;
      while (y < x) do {
         call printStr("*");
         y = y + 1;
      }
      call printStr("\n");
      x = x - 1;
   }
}

function arrayFun(int size, int startVal)
{
   local int i;
   local int sum;
   i = 0; 
   while (i < size) do {
      arr[i] = startVal;
      startVal = startVal + 1;
      i = i + 1;
   }
   i = 0; 
   while (i < size) do {
      call printStr("arr[");
      call printInt(i);
      call printStr("] = ");
      call printInt(arr[i]);
      call printStr("\n");
      i = i + 1;
   }
   i = 0; 
   sum = 0;
   while (i < size) do {
      sum = sum + arr[i];
      i = i + 1;
   }
   call printStr("array sum = ");
   call printInt(sum);
   call printStr("\n");
}

program {
   call printStr("Enter value for x: ");
   call readInt() : x;
   if (x > 100) then {
      call printStr("x is over 100!\n");
   } else {
      call printStr("x is 100 or less!\n");
   }
   call makePattern("hello", "goodbye", 42);
   call printStr("Enter starting array value: ");
   call readInt() : x;
   call arrayFun(20,x);
   call printStr("Program done.\n");
}