<!--
 Copyright 2018 The CUE Authors

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

# The CUE Language Specification

## Introduction

This is a reference manual for the CUE data constraint language.
CUE, pronounced cue or Q, is a general-purpose and strongly typed
constraint-based language.
It can be used for data templating, data validation, code generation, scripting,
and many other applications involving structured data.
The CUE tooling, layered on top of CUE, provides
a general-purpose scripting language for creating scripts as well as
simple servers, also expressed in CUE.

CUE was designed with cloud configuration, and related systems, in mind,
but is not limited to this domain.
It derives its formalism from relational programming languages.
This formalism allows for managing and reasoning over large amounts of
data in a straightforward manner.

The grammar is compact and regular, allowing for easy analysis by automatic
tools such as integrated development environments.

This document is maintained by mpvl@golang.org.
CUE has a lot of similarities with the Go language. This document draws heavily
from the Go specification as a result.

CUE draws its influence from many languages.
Its main influences were BCL/GCL (internal to Google),
LKB (LinGO), Go, and JSON.
Others are Swift, TypeScript, JavaScript, Prolog, NCL (internal to Google),
Jsonnet, HCL, Flabbergast, Nix, JSONPath, Haskell, Objective-C, and Python.


## Notation

The syntax is specified using Extended Backus-Naur Form (EBNF):

```
Production  = production_name "=" [ Expression ] "." .
Expression  = Alternative { "|" Alternative } .
Alternative = Term { Term } .
Term        = production_name | token [ "…" token ] | Group | Option | Repetition .
Group       = "(" Expression ")" .
Option      = "[" Expression "]" .
Repetition  = "{" Expression "}" .
```

Productions are expressions constructed from terms and the following operators,
in increasing precedence:

```
|   alternation
()  grouping
[]  option (0 or 1 times)
{}  repetition (0 to n times)
```

Lower-case production names are used to identify lexical tokens. Non-terminals
are in CamelCase. Lexical tokens are enclosed in double quotes "" or back quotes
``.

The form a … b represents the set of characters from a through b as
alternatives. The horizontal ellipsis … is also used elsewhere in the spec to
informally denote various enumerations or code snippets that are not further
specified. The character … (as opposed to the three characters ...) is not a
token of the CUE language.


## Source code representation

Source code is Unicode text encoded in UTF-8.
Unless otherwise noted, the text is not canonicalized, so a single
accented code point is distinct from the same character constructed from
combining an accent and a letter; those are treated as two code points.
For simplicity, this document will use the unqualified term character to refer
to a Unicode code point in the source text.

Each code point is distinct; for instance, upper and lower case letters are
different characters.
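
As a small illustration of the missing canonicalization, the sketch below
spells what a reader sees as the same accented letter in two ways, using
escapes so that the code points are visible.
The field names `precomposed` and `combining` are made up for this example.

```
precomposed: "\u00e9"   // U+00E9: a single code point
combining:   "e\u0301"  // U+0065 followed by U+0301: two code points
```

Because the text is not canonicalized, the two fields hold different strings.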

Implementation restriction: For compatibility with other tools, a compiler may
disallow the NUL character (U+0000) in the source text.

Implementation restriction: For compatibility with other tools, a compiler may
ignore a UTF-8-encoded byte order mark (U+FEFF) if it is the first Unicode code
point in the source text. A byte order mark may be disallowed anywhere else in
the source.


### Characters

The following terms are used to denote specific Unicode character classes:

```
newline        = /* the Unicode code point U+000A */ .
unicode_char   = /* an arbitrary Unicode code point except newline */ .
unicode_letter = /* a Unicode code point classified as "Letter" */ .
unicode_digit  = /* a Unicode code point classified as "Number, decimal digit" */ .
```

In The Unicode Standard 8.0, Section 4.5 "General Category" defines a set of
character categories.
CUE treats all characters in any of the Letter categories Lu, Ll, Lt, Lm, or Lo
as Unicode letters, and those in the Number category Nd as Unicode digits.


### Letters and digits

The underscore character _ (U+005F) is considered a letter.

```
letter        = unicode_letter | "_" .
decimal_digit = "0" … "9" .
octal_digit   = "0" … "7" .
hex_digit     = "0" … "9" | "A" … "F" | "a" … "f" .
```


## Lexical elements

### Comments
Comments serve as program documentation.
CUE supports line comments that start with the character sequence //
and stop at the end of the line.

A comment cannot start inside a string literal or inside a comment.
A comment acts like a newline.
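
The following fragment is a minimal sketch of these rules; the field names
`a` and `b` are arbitrary.

```
// A line comment runs to the end of the line.
a: 1 // a comment may also follow other tokens on the same line
b: "// this is text, not a comment: it starts inside a string literal"
```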


### Tokens

Tokens form the vocabulary of the CUE language. There are four classes:
identifiers, keywords, operators and punctuation, and literals. White space,
formed from spaces (U+0020), horizontal tabs (U+0009), carriage returns
(U+000D), and newlines (U+000A), is ignored except as it separates tokens that
would otherwise combine into a single token. Also, a newline or end of file may
trigger the insertion of a comma. While breaking the input into tokens, the
next token is the longest sequence of characters that form a valid token.


### Commas

The formal grammar uses commas "," as terminators in a number of productions.
CUE programs may omit most of these commas using the following rule:

When the input is broken into tokens, a comma is automatically inserted into
the token stream immediately after a line's final token if that token is

- an identifier
- null, true, false, bottom, or an integer, floating-point, or string literal
- one of the characters ), ], or }


Although commas are automatically inserted, the parser will require
explicit commas between two list elements.

To reflect idiomatic use, examples in this document elide commas using
these rules.
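
As a sketch of how this plays out, the two structs below are intended to be
equivalent; the names `x`, `y`, `a`, `b`, and `c` are chosen only for
illustration.

```
// Commas elided: one is inserted after the final token of each line that
// ends in an integer literal, a string literal, "]", or "}".
x: {
	a: 1
	b: "two"
	c: [1, 2, 3]
}

// The same struct with the commas between fields written out.
// The commas between the list elements are required in both forms.
y: {a: 1, b: "two", c: [1, 2, 3]}
```

No comma is inserted after `x: {`, because `{` is not one of the tokens
listed above.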


### Identifiers

Identifiers name entities such as fields and aliases.
An identifier is a sequence of one or more letters (which includes `_` and `$`)
and digits.
It may not be `_` or `$`.
The first character in an identifier must be a letter.

<!--
TODO: allow identifiers as defined in Unicode UAX #31
(https://unicode.org/reports/tr31/).

Identifiers are normalized using the NFC normal form.
-->

```
identifier = letter { letter | unicode_digit } .
```

```
a
_x9
fieldName
αβ
```

<!-- TODO: Allow Unicode identifiers TR 31 http://unicode.org/reports/tr31/ -->

Some identifiers are [predeclared](#predeclared-identifiers).


### Keywords

CUE has a limited set of keywords.
In addition, CUE reserves all identifiers starting with `__` (double underscores)
as keywords.
These are typically targets of predeclared identifiers.

All keywords may be used as labels (field names).
They cannot, however, be used as identifiers to refer to the same name.


#### Values

The following keywords are values.

```
null         true         false
```

These can never be used to refer to a field of the same name.
This restriction is to ensure compatibility with JSON configuration files.


#### Preamble

The following keywords are used in the preamble of a CUE file.
After the preamble, they may be used as identifiers to refer to namesake fields.

```
package      import
```


#### Comprehension clauses

The following keywords are used in comprehensions.

```
for          in           if           let
```

The keywords `for`, `if` and `let` cannot be used as identifiers to
refer to fields.

<!--
TODO:
    reduce [to]
    order [by]
-->


#### Arithmetic

The following pseudo keywords can be used as operators in expressions.

```
div          mod          quo          rem
```

These may be used as identifiers to refer to fields in all other contexts.


### Operators and punctuation

The following character sequences represent operators and punctuation:

```
+     div   &&    ==    <     =     (     )
-     mod   ||    !=    >     ::    {     }
*     quo   &     =~    <=    :     [     ]
/     rem   |     !~    >=    .     ...   ,
            _|_   !
```
<!--
Free tokens: # ; ~ $ ^

// To be used:
  @   at: associative lists.

// Idea: use # instead of @ for attributes and allow them at declaration level.
// This will open up the possibility of defining #! at the start of a file
// without requiring special syntax. Although probably not quite.
 -->


### Integer literals

An integer literal is a sequence of digits representing an integer value.
An optional prefix sets a non-decimal base: 0o for octal,
0x or 0X for hexadecimal, and 0b for binary.
In hexadecimal literals, letters a-f and A-F represent values 10 through 15.
All integers allow interstitial underscores "_";
these have no meaning and are solely for readability.

Decimal integers may have an SI or IEC multiplier.
Multipliers can be used with fractional numbers.
When multiplying a fraction by a multiplier, the result is truncated
towards zero if it is not an integer.

```
int_lit     = decimal_lit | si_lit | octal_lit | binary_lit | hex_lit .
decimal_lit = ( "1" … "9" ) { [ "_" ] decimal_digit } .
decimals    = decimal_digit { [ "_" ] decimal_digit } .
si_lit      = decimals [ "." decimals ] multiplier |
              "." decimals multiplier .
binary_lit  = "0b" binary_digit { [ "_" ] binary_digit } .
hex_lit     = "0" ( "x" | "X" ) hex_digit { [ "_" ] hex_digit } .
octal_lit   = "0o" octal_digit { [ "_" ] octal_digit } .
multiplier  = ( "K" | "M" | "G" | "T" | "P" ) [ "i" ] .

float_lit   = decimals "." [ decimals ] [ exponent ] |
              decimals exponent |
              "." decimals [ exponent ] .
exponent    = ( "e" | "E" ) [ "+" | "-" ] decimals .
```
<!--
TODO: consider allowing Exo (and up), if not followed by a sign
or number. Alternatively one could only allow Ei, Yi, and Zi.
-->

```
42
1.5Gi
170_141_183_460_469_231_731_687_303_715_884_105_727
0xBad_Face
0o755
0b0101_0001
```
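
The values of a few such literals can be worked out as follows.
This is a sketch that assumes the usual SI and IEC meanings of the
multipliers (`K`, `M`, ... are powers of 1000; `Ki`, `Mi`, ... are powers
of 1024); the field names are arbitrary.

```
a: 1K       // 1000
b: 1Ki      // 1024
c: 1.5Gi    // 1610612736, that is 1.5 * 1024 * 1024 * 1024
d: 1.0001K  // 1000.1 is not an integer, so it is truncated towards zero: 1000
e: 1_000    // 1000: the underscore has no meaning
f: 0x10     // 16
g: 0o755    // 493, that is 7*64 + 5*8 + 5
h: 0b101    // 5
```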

### Decimal floating-point literals

A decimal floating-point literal is a representation of
a decimal floating-point value (a _float_).
It has an integer part, a decimal point, a fractional part, and an
exponent part.
The integer and fractional part comprise decimal digits; the
exponent part is an `e` or `E` followed by an optionally signed decimal exponent.
One of the integer part or the fractional part may be elided; one of the decimal
point or the exponent may be elided.

```
float_lit = decimals "." [ decimals ] [ exponent ] |
            decimals exponent |
            "." decimals [ exponent ] .
exponent  = ( "e" | "E" ) [ "+" | "-" ] decimals .
```

```
0.
72.40
072.40  // == 72.40
2.71828
1.e+0
6.67428e-11
1E6
.25
.12345E+5
```


### String and byte sequence literals

A string literal represents a string constant obtained from concatenating a
sequence of characters.
A byte sequence literal represents a sequence of bytes.

String and byte sequence literals are character sequences between,
respectively, double and single quotes, as in `"bar"` and `'bar'`.
Within the quotes, any character may appear except newline and,
respectively, unescaped double or single quote.
String literals must be valid UTF-8.
Byte sequences may contain any sequence of bytes.

Several escape sequences allow arbitrary values to be encoded as ASCII text.
An escape sequence starts with an _escape delimiter_, which is `\` by default.
The escape delimiter may be altered to be `\` plus a fixed number of
hash symbols `#`
by padding the start and end of a string or byte sequence literal
with this number of hash symbols.

There are four ways to represent the integer value as a numeric constant: `\x`
followed by exactly two hexadecimal digits; `\u` followed by exactly four
hexadecimal digits; `\U` followed by exactly eight hexadecimal digits, and a
plain backslash `\` followed by exactly three octal digits.
In each case the value of the literal is the value represented by the
digits in the corresponding base.
Hexadecimal and octal escapes are only allowed within byte sequences
(single quotes).

Although these representations all result in an integer, they have different
valid ranges.
Octal escapes must represent a value between 0 and 255 inclusive.
Hexadecimal escapes satisfy this condition by construction.
The escapes `\u` and `\U` represent Unicode code points so within them
some values are illegal, in particular those above `0x10FFFF`.
Surrogate halves are allowed,
but are translated into their non-surrogate equivalent internally.

The three-digit octal (`\nnn`) and two-digit hexadecimal (`\xnn`) escapes
represent individual bytes of the resulting string; all other escapes represent
the (possibly multi-byte) UTF-8 encoding of individual characters.
Thus inside a byte sequence literal `\377` and `\xFF` represent a single byte of
value `0xFF=255`, while `ÿ`, `\u00FF`, `\U000000FF` and `\xc3\xbf` represent
the two bytes `0xc3 0xbf` of the UTF-8
encoding of character `U+00FF`.

```
\a   U+0007 alert or bell
\b   U+0008 backspace
\f   U+000C form feed
\n   U+000A line feed or newline
\r   U+000D carriage return
\t   U+0009 horizontal tab
\v   U+000b vertical tab
\/   U+002f slash (solidus)
\\   U+005c backslash
\'   U+0027 single quote  (valid escape only within single quoted literals)
\"   U+0022 double quote  (valid escape only within double quoted literals)
```

The escape `\(` is used as an escape for string interpolation.
A `\(` must be followed by a valid CUE Expression, followed by a `)`.

All other sequences starting with a backslash are illegal inside literals.

```
escaped_char     = `\` { `#` } ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | "/" | `\` | "'" | `"` ) .
byte_value       = octal_byte_value | hex_byte_value .
octal_byte_value = `\` octal_digit octal_digit octal_digit .
hex_byte_value   = `\` "x" hex_digit hex_digit .
little_u_value   = `\` "u" hex_digit hex_digit hex_digit hex_digit .
big_u_value      = `\` "U" hex_digit hex_digit hex_digit hex_digit
                           hex_digit hex_digit hex_digit hex_digit .
unicode_value    = unicode_char | little_u_value | big_u_value | escaped_char .
interpolation    = "\(" Expression ")" .

string_lit       = simple_string_lit |
                   multiline_string_lit |
                   simple_bytes_lit |
                   multiline_bytes_lit |
                   `#` string_lit `#` .

simple_string_lit    = `"` { unicode_value | interpolation } `"` .
simple_bytes_lit     = `'` { unicode_value | interpolation | byte_value } `'` .
multiline_string_lit = `"""` newline
                             { unicode_value | interpolation | newline }
                             newline `"""` .
multiline_bytes_lit  = "'''" newline
                             { unicode_value | interpolation | byte_value | newline }
                             newline "'''" .
```

Carriage return characters (`\r`) inside string literals are discarded from
the string value.

```
'a\000\xab'
'\007'
'\377'
'\xa'                // illegal: too few hexadecimal digits
"\n"
"\""
'Hello, world!\n'
"Hello, \( name )!"
"日本語"
"\u65e5本\U00008a9e"
"\xff\u00FF"
"\uD800"             // illegal: surrogate half (TODO: probably should allow)
"\U00110000"         // illegal: invalid Unicode code point

#"This is not an \(interpolation)"#
#"This is an \#(interpolation)"#
#"The sequence "\U0001F604" renders as \#U0001F604."#
```

These examples all represent the same string:

```
"日本語"                                 // UTF-8 input text
'日本語'                                 // UTF-8 input text as byte sequence
`日本語`                                 // UTF-8 input text as a raw literal
"\u65e5\u672c\u8a9e"                    // the explicit Unicode code points
"\U000065e5\U0000672c\U00008a9e"        // the explicit Unicode code points
"\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e"  // the explicit UTF-8 bytes
```

If the source code represents a character as two code points, such as a
combining form involving an accent and a letter, the result will appear as two
code points if placed in a string literal.

Strings and byte sequences have a multiline equivalent.
Multiline strings are like their single-line equivalent,
but allow newline characters.

Multiline strings and byte sequences respectively start with
a triple double quote (`"""`) or triple single quote (`'''`),
immediately followed by a newline, which is discarded from the string contents.
The string is closed by a matching triple quote, which must be by itself
on a new line, preceded by optional whitespace.
The newline preceding the closing quote is discarded from the string contents.
The whitespace before a closing triple quote must appear before any non-empty
line after the opening quote and will be removed from each of these
lines in the string literal.
A closing triple quote may not appear in the string.
To include it, it suffices to escape one of the quotes.

```
"""
    lily:
    out of the water
    out of itself

    bass
    picking bugs
    off the moon
        — Nick Virgilio, Selected Haiku, 1988
    """
```

This represents the same string as:

```
"lily:\nout of the water\nout of itself\n\n" +
"bass\npicking bugs\noff the moon\n" +
"    — Nick Virgilio, Selected Haiku, 1988"
```

<!-- TODO: other values

Support for other values:
- Duration literals
- regular expressions: `re("[a-z]")`
-->


## Values

In addition to simple values like `"hello"` and `42.0`, CUE has _structs_.
A struct is a map from labels to values, like `{a: 42.0, b: "hello"}`.
Structs are CUE's only way of building up complex values;
lists, which we will see later,
are defined in terms of structs.

All possible values are ordered in a lattice,
a partial order where every two elements have a single greatest lower bound.
A value `a` is an _instance_ of a value `b`,
denoted `a ⊑ b`, if `b == a` or `b` is more general than `a`,
that is if `a` orders before `b` in the partial order
(`⊑` is _not_ a CUE operator).
We also say that `b` _subsumes_ `a` in this case.
In graphical terms, `b` is "above" `a` in the lattice.

At the top of the lattice is the single ancestor of all values, called
_top_, denoted `_` in CUE.
Every value is an instance of top.

At the bottom of the lattice is the value called _bottom_, denoted `_|_`.
A bottom value usually indicates an error.
Bottom is an instance of every value.

An _atom_ is any value whose only instances are itself and bottom.
Examples of atoms are `42.0`, `"hello"`, `true`, `null`.

A value is _concrete_ if it is either an atom, or a struct all of whose
field values are themselves concrete, recursively.

CUE's values also include what we normally think of as types, like `string` and
`float`.
But CUE does not distinguish between types and values; only the
relationship of values in the lattice is important.
Each CUE "type" subsumes the concrete values that one would normally think
of as part of that type.
For example, `"hello"` is an instance of `string`, and `42.0` is an instance of
`float`.
In addition to `string` and `float`, CUE has `null`, `int`, `bool` and `bytes`.
We informally call these CUE's "basic types".


```
false ⊑ bool
true  ⊑ bool
true  ⊑ true
5.0   ⊑ float
bool  ⊑ _
_|_   ⊑ _
_|_   ⊑ _|_

_     ⋢ _|_
_     ⋢ bool
int   ⋢ bool
bool  ⋢ int
false ⋢ true
true  ⋢ false
float ⋢ 5.0
5     ⋢ 6
```
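
To tie the definitions of atoms and concreteness to the basic types, here is
a short sketch; the field names are arbitrary and the comments restate the
definitions above rather than the output of any tool.

```
a: 42.0                // an atom, and therefore concrete
b: {x: 1, y: "hello"}  // concrete: every field value is concrete
c: {x: 1, y: string}   // not concrete: y is neither an atom nor a concrete struct
d: float               // not concrete: float has instances such as 5.0
```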


### Unification

The _unification_ of values `a` and `b`
is defined as the greatest lower bound of `a` and `b`. (That is, the
value `u` such that `u ⊑ a` and `u ⊑ b`,
and for any other value `v` for which `v ⊑ a` and `v ⊑ b`
it holds that `v ⊑ u`.)
Since CUE values form a lattice, the unification of two CUE values is
always unique.

These all follow from the definition of unification:
- The unification of `a` with itself is always `a`.
- The unification of values `a` and `b` where `a ⊑ b` is always `a`.
- The unification of a value with bottom is always bottom.

Unification in CUE is a [binary expression](#Operands), written `a & b`.
It is commutative and associative.
As a consequence, order of evaluation is irrelevant, a property that is key
to many of the constructs in the CUE language as well as the tooling layered
on top of it.
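
A few unifications, worked out from the definition, may make this concrete.
The field names are arbitrary; the results in the comments follow from the
lattice rules stated in this section.

```
a: int & 5            // 5, because 5 ⊑ int
b: 5 & 5              // 5, a value unified with itself
c: _ & bool           // bool, because every value is an instance of top
d: 5 & 6              // _|_, the only instance shared by two different atoms
e: _|_ & "hello"      // _|_
f: {x: 1} & {y: "s"}  // {x: 1, y: "s"}
g: (int & 5) & _      // 5, the same result as int & (5 & _)
```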



<!-- TODO: explicitly mention that disjunction is not a binary operation
but a definition of a single value?-->


### Disjunction

The _disjunction_ of values `a` and `b`
is defined as the least upper bound of `a` and `b`.
(That is, the value `d` such that `a ⊑ d` and `b ⊑ d`,
and for any other value `e` for which `a ⊑ e` and `b ⊑ e`,
it holds that `d ⊑ e`.)
This style of disjunctions is sometimes also referred to as sum types.
Since CUE values form a lattice, the disjunction of two CUE values is always unique.


These all follow from the definition of disjunction:
- The disjunction of `a` with itself is always `a`.
- The disjunction of values `a` and `b` where `a ⊑ b` is always `b`.
- The disjunction of a value `a` with bottom is always `a`.
- The disjunction of two bottom values is bottom.

Disjunction in CUE is a [binary expression](#Operands), written `a | b`.
It is commutative, associative, and idempotent.

The unification of a disjunction with another value is equal to the disjunction
composed of the unification of this value with all of the original elements
of the disjunction.
In other words, unification distributes over disjunction.

```
(a_0 | ... | a_n) & b ==> a_0&b | ... | a_n&b
```

```
Expression                Result
({a:1} | {b:2}) & {c:3}   {a:1, c:3} | {b:2, c:3}
(int | string) & "foo"    "foo"
("a" | "b") & "c"         _|_
```
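
The intermediate steps behind such results can be spelled out by applying the
distribution rule and then dropping bottom elements.
The following is a sketch with arbitrary field names.

```
a: ("tcp" | "udp") & "tcp"           // "tcp"&"tcp" | "udp"&"tcp" = "tcp" | _|_ = "tcp"
b: ("tcp" | "udp") & "icmp"          // _|_ | _|_ = _|_
c: (int | string) & (string | bool)  // only string & string is not bottom: string
```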

A disjunction is _normalized_ if there is no element
`a` for which there is another element `b` such that `a ⊑ b`.

<!--
Normalization is important, as we need to account for spurious elements.
For instance "tcp" | "tcp" should resolve to "tcp".

Also consider

    ({a:1} | {b:1}) & ({a:1} | {b:2}) -> {a:1} | {a:1,b:1} | {a:1,b:2},

in this case, elements {a:1,b:1} and {a:1,b:2} are subsumed by {a:1} and thus
this expression is logically equivalent to {a:1} and should therefore be
considered to be unambiguous and resolve to {a:1} if a concrete value is needed.

For instance, in

    x: ({a:1} | {b:1}) & ({a:1} | {b:2}) // -> {a:1} | {a:1,b:1} | {a:1,b:2}
    y: x.a // 1

y should resolve to 1, and not an error.

For comparison, in

    x: ({a:1, b:1} | {b:2}) & {a:1} // -> {a:1,b:1} | {a:1,b:2}
    y: x.a // _|_

y should be an error as x is still ambiguous before the selector is applied,
even though `a` resolves to 1 in all cases.
-->
699
Jonathan Amsterdame4790382019-01-20 10:29:29 -0500700
#### Default values

Any element of a disjunction can be marked as a default
by prefixing it with an asterisk `*`.
Intuitively, when an expression needs to be resolved for an operation other
than unification or disjunction,
non-starred elements are dropped in favor of starred ones if the starred ones
do not resolve to bottom.

More precisely, any value `v` may be associated with a default value `d`,
denoted `(v, d)` (not CUE syntax),
where `d` must be an instance of `v` (`d ⊑ v`).
The rules for unifying and disjoining such values are as follows:

```
U1: (v1, d1) & v2       => (v1&v2, d1&v2)
U2: (v1, d1) & (v2, d2) => (v1&v2, d1&d2)

D1: (v1, d1) | v2       => (v1|v2, d1)
D2: (v1, d1) | (v2, d2) => (v1|v2, d1|d2)
```

Default values may be introduced within disjunctions
by _marking_ terms of a disjunction with an asterisk `*`
([a unary expression](#Operators)).
The default value of a disjunction with marked terms is the disjunction
of those marked terms, applying the following rules for marks:

```
M1: *v        => (v, v)
M2: *(v1, d1) => (v1, d1)
```

In general, any operation `f` in CUE involving default values proceeds along the
following lines:
```
O1: f((v1, d1), ..., (vn, dn))  => (f(v1, ..., vn), f(d1, ..., dn))
```
where, with the exception of disjunction, a value `v` without a default
value is promoted to `(v, v)`.


```
Expression               Value-default pair      Rules applied
*"tcp" | "udp"           ("tcp"|"udp", "tcp")    M1, D1
string | *"foo"          (string, "foo")         M1, D1

*1 | 2 | 3               (1|2|3, 1)              M1, D1

(*1|2|3) | (1|*2|3)      (1|2|3, 1|2)            M1, D1, D2
(*1|2|3) | *(1|*2|3)     (1|2|3, 1|2)            M1, D1, M2, D2
(*1|2|3) | (1|*2|3)&2    (1|2|3, 1|2)            M1, D1, U1, D2

(*1|2) & (1|*2)          (1|2, _|_)              M1, D1, U2

(*1|2) + (1|*2)          ((1|2)+(1|2), 3)        M1, D1, O1
```

The rules of subsumption for defaults can be derived from the above definitions
and are as follows.

```
(v2, d2) ⊑ (v1, d1)  if v2 ⊑ v1 and d2 ⊑ d1
(v1, d1) ⊑ v         if v1 ⊑ v
v ⊑ (v1, d1)         if v ⊑ d1
```
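
A few sample relations may help to read these rules.
The value-default pairs below are chosen for illustration only and are
written in the `(v, d)` notation, which is not CUE syntax.

```
Relation                   Holds   Reason
(int, 1) ⊑ (number, 1)     yes     int ⊑ number and 1 ⊑ 1
(int, 1) ⊑ (number, 2)     no      1 ⋢ 2
(int, 1) ⊑ number          yes     int ⊑ number
1 ⊑ (int, 1)               yes     1 ⊑ 1
2 ⊑ (int, 1)               no      2 ⋢ 1
int ⊑ (int, 1)             no      int ⋢ 1
```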

<!--
For the second rule, note that by definition d1 ⊑ v1, so d1 ⊑ v1 ⊑ v.

The last one is so restrictive as v could still be made more specific by
associating it with a default that is not subsumed by d1.

Proof:
  by definition for any d ⊑ v, it holds that (v, d) ⊑ v,
  where the most general value is (v, v).
  Given the subsumption rule for (v2, d2) ⊑ (v1, d1),
  from (v, v) ⊑ v ⊑ (v1, d1) it follows that v ⊑ d1
  exactly defines the boundary of this subsumption.
-->

<!--
(non-normalized entries could also be implicitly marked, allowing writing
int | 1, instead of int | *1, but that can be done in a backwards
compatible way later if really desirable, as long as we require that
disjunction literals be normalized).
-->


```
Expression                       Resolves to
"tcp" | "udp"                    "tcp" | "udp"
*"tcp" | "udp"                   "tcp"
float | *1                       1
*string | 1.0                    string

(*1|2|3) | (1|*2|3)              1|2
(*1|2|3) & (1|*2|3)              1|2|3 // default is _|_

(* >=5 | int) & (* <=5 | int)    5

(*"tcp"|"udp") & ("udp"|*"tcp")  "tcp"
(*"tcp"|"udp") & ("udp"|"tcp")   "tcp"
(*"tcp"|"udp") & "tcp"           "tcp"
(*"tcp"|"udp") & (*"udp"|"tcp")  "tcp" | "udp" // default is _|_

(*true | false) & bool           true
(*true | false) & (true | false) true

{a: 1} | {b: 1}                  {a: 1} | {b: 1}
{a: 1} | *{b: 1}                 {b:1}
*{a: 1} | *{b: 1}                {a: 1} | {b: 1}
({a: 1} | {b: 1}) & {a:1}        {a:1}  // after eliminating {a:1,b:1} by normalization
({a:1}|*{b:1}) & ({a:1}|*{b:1})  {b:1}  // after eliminating {a:1,b:1} by normalization
```


### Bottom and errors

Any evaluation error in CUE results in a bottom value, represented by
the token `_|_`.
Bottom is an instance of every other value.

Implementations may associate error strings with different instances of bottom;
logically they all remain the same value.
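
The following sketch shows a few ways bottom arises and how it behaves as an
operand.
The expressions are illustrative; the results follow from bottom being an
instance of every value and from the disjunction rules given earlier.

```
Expr           Result
1 & 2          _|_
"a" & int      _|_
_|_ & 5        _|_
_|_ | 5        5
_|_ | _|_      _|_
```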


### Top

Top is represented by the underscore character `_`, lexically an identifier.
Unifying any value `v` with top results in `v` itself.

```
Expr        Result
_ & 5       5
_ & _       _
_ & _|_     _|_
_ | _|_     _
```


### Null

The _null value_ is represented with the keyword `null`.
It has only one parent, top, and one child, bottom.
It is unordered with respect to any other value.

```
null_lit   = "null"
```

```
null & 8     _|_
null & _     null
null & _|_   _|_
```


### Boolean values

A _boolean type_ represents the set of Boolean truth values denoted by
the keywords `true` and `false`.
The predeclared boolean type is `bool`; it is a defined type and a separate
element in the lattice.

```
boolean_lit = "true" | "false"
```

```
bool & true          true
true & true          true
true & false         _|_
bool & (false|true)  false | true
bool & (true|false)  true | false
```


### Numeric values

The _integer type_ represents the set of all integral numbers.
The _decimal floating-point type_ represents the set of all decimal floating-point
numbers.
They are two distinct types.
Both are instances of a generic `number` type.

<!--
                    number
                   /      \
                int      float
-->

The predeclared number, integer, and decimal floating-point types are
`number`, `int` and `float`; they are defined types.
<!--
TODO: should we drop float? It is somewhat more precise and probably a good idea
to have it in the programmatic API, but it may be confusing to have to deal
with it in the language.
-->

A decimal floating-point literal always has type `float`;
it is not an instance of `int` even if it is an integral number.

Integer literals are always of type `int` and don't match type `float`.
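
As a sketch of how literals relate to the predeclared numeric types,
the unifications below apply the statements above directly.
The literals are arbitrary examples.

```
Expr            Result
1 & int         1
1 & number      1
1.0 & float     1.0
1.0 & number    1.0
1.0 & int       _|_   // a floating-point literal is never an int
1 & float       _|_   // an integer literal does not match float
```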

Numeric literals are exact values of arbitrary precision.
If the operation permits it, numbers should be kept in arbitrary precision.

Implementation restriction: although numeric values have arbitrary precision
in the language, implementations may implement them using an internal
representation with limited precision.
That said, every implementation must:

- Represent integer values with at least 256 bits.
- Represent floating-point values with a mantissa of at least 256 bits and
  a signed binary exponent of at least 16 bits.
- Give an error if unable to represent an integer value precisely.
- Give an error if unable to represent a floating-point value due to overflow.
- Round to the nearest representable value if unable to represent
  a floating-point value due to limits on precision.

These requirements apply to the result of any expression except for builtin
functions for which an unusual loss of precision must be explicitly documented.


### Strings

The _string type_ represents the set of UTF-8 strings,
not allowing surrogates.
The predeclared string type is `string`; it is a defined type.

The length of a string `s` (its size in bytes) can be discovered using
the built-in function `len`.
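
Because the length is counted in bytes rather than characters, non-ASCII
text is longer than it looks.
In the sketch below, the sample strings are arbitrary and `é` is assumed to
be the single code point U+00E9, which UTF-8 encodes as two bytes.

```
len("")         // 0
len("hello")    // 5
len("héllo")    // 6
```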


### Bytes

The _bytes type_ represents the set of byte sequences.
A byte sequence value is a (possibly empty) sequence of bytes.
The number of bytes is called the length of the byte sequence
and is never negative.
The predeclared byte sequence type is `bytes`; it is a defined type.


### Bounds

A _bound_, syntactically a [unary expression](#Operands), defines
an infinite disjunction of concrete values that can be represented
as a single comparison.

For any [comparison operator](#Comparison-operators) `op` except `==`,
`op a` is the disjunction of every `x` such that `x op a`.

```
2 & >=2 & <=5           // 2, where 2 is either an int or float.
2.5 & >=1 & <=5         // 2.5
2 & >=1.0 & <3.0        // 2.0
2 & >1 & <3.0           // 2.0
2.5 & int & >1 & <5     // _|_
2.5 & float & >1 & <5   // 2.5
int & 2 & >1.0 & <3.0   // _|_
2.5 & >=(int & 1) & <5  // _|_
>=0 & <=7 & >=3 & <=10  // >=3 & <=7
!=null & 1              // 1
>=5 & <=5               // 5
```


### Structs

A _struct_ is a set of elements called _fields_, each of
which has a name, called a _label_, and a value.

We say a label is defined for a struct if the struct has a field with the
corresponding label.
The value for a label `f` of struct `a` is denoted `a.f`.
A struct `a` is an instance of `b`, or `a ⊑ b`, if for any label `f`
defined for `b`, label `f` is also defined for `a` and `a.f ⊑ b.f`.
Note that if `a` is an instance of `b` it may have fields with labels that
are not defined for `b`.

The (unique) struct with no fields, written `{}`, has every struct as an
instance. It can be considered the type of all structs.

```
{a: 1} ⊑ {}
{a: 1, b: 1} ⊑ {a: 1}
{a: 1} ⊑ {a: int}
{a: 1, b: 1.0} ⊑ {a: int, b: float}

{} ⋢ {a: 1}
{a: 2} ⋢ {a: 1}
{a: 1} ⋢ {b: 1}
```

A field may be required or optional.
The successful unification of structs `a` and `b` is a new struct `c` which
has all fields of both `a` and `b`, where
the value of a field `f` in `c` is `a.f & b.f` if `f` is in both `a` and `b`,
or just `a.f` or `b.f` if `f` is in just `a` or `b`, respectively.
If a field `f` is in both `a` and `b`, `c.f` is optional only if both
`a.f` and `b.f` are optional.
Any [references](#References) to `a` or `b`
in their respective field values need to be replaced with references to `c`.
The result of a unification is bottom (`_|_`) if any of its non-optional
fields evaluates to bottom, recursively.

<!--NOTE: About bottom values for optional fields being okay.

The proposition ¬P is a close cousin of P → ⊥ and is often used
as an approximation to avoid the issues of using not.
Bottom (⊥) is also frequently used to mean undefined. This makes sense.
Consider `{a?: 2} & {a?: 3}`.
Both structs say `a` is optional; in other words, it may be omitted.
So we can still get a valid result by omitting `a`, even in
case of a conflict.

Granted, this definition may lead to confusing results, especially in
definitions, when tightening an optional field leads to unintentionally
discarding it.
It could be a role of vet checkers to identify such cases (and suggest users
to explicitly use `_|_` to discard a field, for instance).
-->

Syntactically, a struct literal may contain multiple fields with
the same label, the result of which is a single field whose properties are
those of the unification of two fields obtained by unifying two structs.

These examples illustrate required fields only.
Examples with optional fields follow below.

```
Expression                             Result (without optional fields)
{a: int, a: 1}                         {a: 1}
{a: int} & {a: 1}                      {a: 1}
{a: >=1 & <=7} & {a: >=5 & <=9}        {a: >=5 & <=7}
{a: >=1 & <=7, a: >=5 & <=9}           {a: >=5 & <=7}

{a: 1} & {b: 2}                        {a: 1, b: 2}
{a: 1, b: int} & {b: 2}                {a: 1, b: 2}

{a: 1} & {a: 2}                        _|_
```

Optional labels are defined in sets with an expression to select all
labels to which to apply a given constraint.
Syntactically, the label of an optional field set is an expression in square
brackets indicating the matching labels.
The value `string` matches all fields, while a concrete string matches a
single field.
As the latter case is common, a concrete label followed by
a question mark `?` may be used as a shorthand.
So
```
foo?: bar
```
is a shorthand for
```
["foo"]: bar
```
The question mark is not part of the field name.
The token `...` may be used as the last declaration in a struct
and is a shorthand for
```
[_]: _
```

Concrete field labels may be an identifier or string, the latter of which may be
interpolated.
Fields with identifier labels can be referred to within the scope they are
defined; string labels cannot.
References within such interpolated strings are resolved within
the scope of the struct in which the label sequence is
defined and can reference concrete labels lexically preceding
the label within a label sequence.
<!-- We allow this so that rewriting a CUE file to collapse or expand
field sequences has no impact on semantics.
-->

<!--TODO: first implementation round will not yet have expression labels

An ExpressionLabel sets a collection of optional fields to a field value.
By default it defines this value for all possible string labels.
An optional expression limits this to the set of optional fields whose
labels match the expression.
-->


<!-- NOTE: if we allow ...Expr, as in list, it would mean something different. -->


<!-- NOTE:
A DefinitionDecl does not allow repeated labels. This is to avoid
any ambiguity or confusion about whether earlier path components
are to be interpreted as declarations or normal fields (they should
always be normal fields.)
-->

<!--NOTE:
The syntax has been deliberately restricted to allow for the following
future extensions and relaxations:
  - Allow omitting a "?" in an expression label to indicate a concrete
    string value (but maybe we want to use () for that).
  - Make the "?" in expression label optional if expression labels
    are always optional.
  - Or allow eliding the "?" if the expression has no references and
    is obviously not concrete (such as `[string]`).
  - The expression of an expression label may also indicate a struct with
    integer or even number labels
    (beware of imprecise computation in the latter).
      e.g. `{ [int]: string }` is a map of integers to strings.
  - Allow for associative lists (`foo [@.field]: {field: string}`)
  - The `...` notation can be extended analogously to that of a ListList,
    by allowing it to follow with an expression for the remaining properties.
    In that case it is no longer a shorthand for `[string]: _`, but rather
    would define the value for any other value for which there is no field
    defined.
    Like the definition with List, this is somewhat odd, but it allows the
    encoding of JSON schema's and (non-structural) OpenAPI's
    additionalProperties and additionalItems.
-->

```
StructLit       = "{" { Declaration "," } [ "..." ] "}" .
Declaration     = Field | Comprehension | AliasExpr | attribute .
Field           = LabelSpec { LabelSpec } Expression { attribute } .
LabelSpec       = Label ( ":" | "::" ) .
Label           = LabelName [ "?" ] | "[" AliasExpr "]".
LabelName       = identifier | simple_string_lit .

attribute       = "@" identifier "(" attr_tokens ")" .
attr_tokens     = { attr_token |
                    "(" attr_tokens ")" |
                    "[" attr_tokens "]" |
                    "{" attr_tokens "}" } .
attr_token      = /* any token except '(', ')', '[', ']', '{', or '}' */
```

<!--
  TODO: Label = LabelName [ "?" ] | "[" AliasExpr "]" | "(" AliasExpr ")"
-->

```
Expression                             Result (without optional fields)
a: { foo?: string }                    {}
b: { foo: "bar" }                      { foo: "bar" }
c: { foo?: *"bar" | string }           {}

d: a & b                               { foo: "bar" }
e: b & c                               { foo: "bar" }
f: a & c                               {}
g: a & { foo?: number }                {}
h: b & { foo?: number }                _|_
i: c & { foo: string }                 { foo: "bar" }

intMap: [string]: int
intMap: {
    t1: 43
    t2: 2.4  // error: 2.4 is not an integer
}

nameMap: [string]: {
    firstName: string
    nickName:  *firstName | string
}

nameMap: hank: { firstName: "Hank" }
```
The optional field set defined by `nameMap` matches every field,
in this case just `hank`, and unifies the associated constraint
with the matched field, resulting in:
```
nameMap: hank: {
    firstName: "Hank"
    nickName:  "Hank"
}
```


#### Closed structs

By default, structs are open to adding fields.
Instances of an open struct `p` may contain fields not defined in `p`.
This makes it easy to add fields, but can lead to bugs:

```
S: {
    field1: string
}

S1: S & { field2: "foo" }

// S1 is { field1: string, field2: "foo" }


A: {
    field1: string
    field2: string
}

A1: A & {
    feild1: "foo"  // "field1" was accidentally misspelled
}

// A1 is
//    { field1: string, field2: string, feild1: "foo" }
// not the intended
//    { field1: "foo", field2: string }
```

A _closed struct_ `c` is a struct whose instances may not have regular fields
not defined in `c`.
Closing a struct is equivalent to adding an optional field with value `_|_`
for all undefined fields.

Syntactically, closed structs can be explicitly created with the `close` builtin
or implicitly by [definitions](#Definitions).


```
A: close({
    field1: string
    field2: string
})

A1: A & {
    feild1: string
} // _|_ feild1 not defined for A

A2: A & {
    for k,v in { feild1: string } {
        k: v
    }
}  // _|_ feild1 not defined for A

C: close({
    [_]: _
})

C2: C & {
    for k,v in { thisIsFine: string } {
        "\(k)": v
    }
}

D: close({
    // Values generated by comprehensions are treated as embeddings.
    for k,v in { x: string } {
        "\(k)": v
    }
})
```

<!-- (jba) Somewhere it should be said that optional fields are only
     interesting inside closed structs. -->

#### Embedding

A struct may contain an _embedded value_, an operand used
as a declaration, which must evaluate to a struct.
An embedded value of type struct is unified with the struct in which it is
embedded, but disregarding the restrictions imposed by closed structs.
A struct resulting from such a unification is closed if either of the involved
structs was closed.

At the top level, an embedded value may be any type.
In this case, a CUE program will evaluate to the embedded value
and the CUE program may not have top-level regular or optional
fields (definitions and aliases are allowed).
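
As a minimal sketch of a top-level embedding of a non-struct value,
consider a file whose only declarations are a definition and an embedded
string.
The name `Name` and the string contents are made up for this example;
by the rule above, the file as a whole would evaluate to the embedded string.

```
// No regular or optional fields appear at the top level,
// only a definition and an embedded expression.
Name :: "world"

"Hello, \(Name)!"
```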

Syntactically, embeddings may be any expression, except that `<`
is eagerly interpreted as a bind label.

```
S1: {
    a: 1
    b: 2
    {
        c: 3
    }
}
// S1 is { a: 1, b: 2, c: 3 }

S2: close({
    a: 1
    b: 2
    {
        c: 3
    }
})
// same as close(S1)

S3: {
    a: 1
    b: 2
    close({
        c: 3
    })
}
// same as S2
```


#### Definitions

A field of a struct may be declared as a regular field (using `:`)
or as a _definition_ (using `::`).
Definitions are not emitted as part of the model and are never required
to be concrete when emitting data.
It is illegal to have a regular field and a definition with the same name
within the same struct.
Literal structs that are part of a definition's value are implicitly closed,
but may unify unrestricted with other structs within the field's declaration.
This excludes literal structs in embeddings and aliases.

<!--
This may be a more intuitive definition:
    Literal structs that are part of a definition's value are implicitly closed.
    Implicitly closed literal structs that are unified within
    a single field declaration are considered to be a single literal struct.
However, this would make unification non-commutative, unless one imposes an
ordering where literal structs are unified before unifying them with others.
Imposing such an ordering is complex and error prone.
-->
An ellipsis `...` in such literal structs keeps them open,
as it defines `_` for all labels.
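
The effect of the ellipsis can be sketched by comparing two otherwise
identical definitions.
The names `Closed`, `Open`, `c1`, and `o1` are hypothetical and serve only
this illustration; the results follow from the closedness rules above.

```
Closed :: {
    a: int
}

Open :: {
    a: int
    ...
}

c1: Closed & { a: 1, b: 2 }  // _|_, b is not defined for Closed
o1: Open & { a: 1, b: 2 }    // { a: 1, b: 2 }
```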

<!--
Excluding embeddings from recursive closing allows comprehensions to be
interpreted as embeddings without some exception. For instance,
    if x > 2 {
        foo: string
    }
should not cause any failure. It is also consistent with embeddings being
opened when included in a closed struct.

Finally, excluding embeddings from recursive closing allows for
a mechanism to not recursively close, without needing an additional language
construct, such as a triple colon or something else:
foo :: {
    {
        // not recursively closed
    }
    ... // include this to not close outer struct
}

Including aliases from this exclusion, which are more a separate definition
than embedding seems sensible, and allows for an easy mechanism to avoid
closing, aside from embedding.
-->

1352```
Marcel van Lohuizen62658a82019-06-16 12:18:47 +02001353MyStruct :: {
Marcel van Lohuizenfa7e3ce2019-10-10 15:43:34 +02001354 sub field: string
Marcel van Lohuizen62658a82019-06-16 12:18:47 +02001355}
1356
Marcel van Lohuizen62658a82019-06-16 12:18:47 +02001357MyStruct :: {
Marcel van Lohuizenfa7e3ce2019-10-10 15:43:34 +02001358 sub enabled?: bool
Marcel van Lohuizen62658a82019-06-16 12:18:47 +02001359}
1360
1361myValue: MyStruct & {
Marcel van Lohuizenfa7e3ce2019-10-10 15:43:34 +02001362 sub feild: 2 // error, feild not defined in MyStruct
1363 sub enabled: true // okay
Marcel van Lohuizen62658a82019-06-16 12:18:47 +02001364}
1365
1366D :: {
1367 OneOf
1368
1369 c: int // adds this field.
1370}
1371
1372OneOf :: { a: int } | { b: int }
1373
1374
1375D1: D & { a: 12, c: 22 } // { a: 12, c: 22 }
1376D2: D & { a: 12, b: 33 } // _|_ // cannot define both `a` and `b`
1377```
1378
Jonathan Amsterdam061bde12019-09-03 08:28:10 -04001379
Marcel van Lohuizen62658a82019-06-16 12:18:47 +02001380<!---
1381JSON fields are usual camelCase. Clashes can be avoided by adopting the
1382convention that definitions be TitleCase. Unexported definitions are still
1383subject to clashes, but those are likely easier to resolve because they are
1384package internal.
1385--->


#### Attributes

Attributes allow associating meta information with values.
Their primary purpose is to define mappings between CUE and
other representations.
Attributes do not influence the evaluation of CUE.

An attribute associates an identifier with a value, a balanced token sequence,
which is a sequence of CUE tokens with balanced brackets (`()`, `[]`, and `{}`).
The sequence may not contain interpolations.

Fields, structs and packages can be associated with a set of attributes.
Attributes accumulate during unification, but implementations may remove
duplicates that have the same source string representation.
The interpretation of an attribute, including the handling of multiple
attributes for a given identifier, is up to the consumer of the attribute.

Field attributes define additional information about a field,
such as a mapping to a protocol buffer <!-- TODO: add link --> tag or an alternative
name of the field when mapping to a different language.


```
// Package attribute
@protobuf(proto3)

myStruct1: {
    // Struct attribute:
    @jsonschema(id="https://example.org/mystruct1.json")

    // Field attributes
    field: string @go(Field)
    attr:  int    @xml(,attr) @go(Attr)
}

myStruct2: {
    field: string @go(Field)
    attr:  int    @xml(a1,attr) @go(Attr)
}

Combined: myStruct1 & myStruct2
// field: string @go(Field)
// attr:  int    @xml(,attr) @xml(a1,attr) @go(Attr)
```


#### Aliases

Aliases name values that can be referred to
within the [scope](#declarations-and-scopes) in which they are declared.
The name of an alias must be unique within its scope.

```
AliasExpr  = identifier "=" Expression | Expression .
```

Aliases can appear in several positions:

As a declaration in a struct (`X=expr`):

- binds the value to an identifier without including it in the struct.

In front of a Label (`X=label: value`):

- binds the identifier to the same value as `label` would be bound
  to if it were a valid identifier.
- for optional fields (`foo?: bar` and `[foo]: bar`),
  the bound identifier is only visible within the field value (`value`).

Inside a bracketed label (`[X=expr]: value`):

- binds the identifier to the concrete label that matches `expr`
  within the instances of the field value (`value`).

<!-- TODO: explain the difference between aliases and definitions.
     Now that you have definitions, are aliases really necessary?
     Consider removing.
-->

```
// An alias declaration
Alias = 3
a: Alias  // 3

// A field alias
foo: X  // 4
X="not an identifier": 4

// A label alias
[Y=string]: { name: Y }
foo: { value: 1 } // outputs: foo: { name: "foo", value: 1 }
```

<!-- TODO: also allow aliases as lists -->


#### Shorthand notation for nested structs

A field whose value is a struct with a single field may be written as
a colon-separated sequence of the two field names,
followed by a colon and the value of that single field.

```
job: myTask: replicas: 2
```
expands to
```
job: {
    myTask: {
        replicas: 2
    }
}
```
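
Because fields with the same label are unified, shorthand and regular notation
can be mixed. The following is a minimal sketch of that interaction; the
`image` field and its value are hypothetical and only serve the illustration.

```
// Two shorthand declarations that share a prefix ...
job: myTask: replicas: 2
job: myTask: image: "example/image:1.0" // hypothetical field

// ... describe the same value as a single nested declaration:
//
//    job: {
//        myTask: {
//            replicas: 2
//            image:    "example/image:1.0"
//        }
//    }
```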

<!-- OPTIONAL FIELDS:

The optional marker solves the issue of having to print large amounts of
boilerplate when dealing with large types with many optional or default
values (such as Kubernetes).
Writing such optional values in terms of *null | value is tedious,
unpleasant to read, and as it is not well defined what can be dropped or not,
all null values have to be emitted from the output, even if the user
doesn't override them.
Part of the issue is how null is defined. We could adopt a TypeScript-like
approach of introducing "void" or "undefined" to mean "not defined and not
part of the output". But having all of null, undefined, and void can be
confusing. If these ever are introduced anyway, the ? operator could be
expressed along the lines of
   foo?: bar
being a shorthand for
   foo: void | bar
where void is the default if no other default is given.

The current mechanical definition of "?" is straightforward, though, and
probably avoids the need for void, while solving a big issue.

Caveats:
[1] this definition requires explicitly defined fields to be emitted, even
if they could be elided (for instance if the explicit value is the default
value defined for an optional field). This is probably a good thing.

[2] a default value may still need to be included in an output if it is not
the zero value for that field and it is not known if any outside system is
aware of defaults. For instance, which defaults are specified by the user
and which by the schema understood by the receiving system.
The use of "?" together with defaults should therefore be used carefully
in non-schema definitions.
Problematic cases should be easy to detect by a vet-like check, though.

[3] It should be considered how this affects the trim command.
Should values implied by optional fields be allowed to be removed?
Probably not. This restriction is unlikely to limit the usefulness of trim,
though.

[4] There should be an option to emit all concrete optional values.
-->

### Lists

A list literal defines a new value of type list.
A list may be open or closed.
An open list is indicated with a `...` at the end of an element list,
optionally followed by a value for the remaining elements.

The length of a closed list is the number of elements it contains.
The length of an open list is its number of elements as a lower bound
and an unlimited number of elements as its upper bound.

```
ListLit       = "[" [ ElementList [ "," [ "..." [ Expression ] ] ] ] "]" .
ElementList   = Expression { "," Expression } .
```
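
As a small sketch of how open and closed lists behave under unification
(the field names are hypothetical and chosen only for illustration):

```
closed: [1, 2]          // exactly two elements
open:   [1, 2, ...]     // at least two elements; the rest are unconstrained
typed:  [1, 2, ...int]  // at least two elements; the rest must be int

a: typed & [1, 2, 3]    // [1, 2, 3]
b: closed & [1, 2, 3]   // _|_ // a closed list of length 2 cannot have 3 elements
c: typed & [1, 2, "x"]  // _|_ // "x" is not an int
```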

Lists can be thought of as structs:

```
List: *null | {
    Elem: _
    Tail: List
}
```

For closed lists, `Tail` is `null` for the last element; for open lists it is
`*null | List`, defaulting to the shortest variant.
For instance, the open list `[ 1, 2, ... ]` can be represented as:
```
open: List & { Elem: 1, Tail: { Elem: 2 } }
```
and the closed version of this list, `[ 1, 2 ]`, as
```
closed: List & { Elem: 1, Tail: { Elem: 2, Tail: null } }
```

Using this representation, the subsumption rules for lists can
be derived from those of structs.
Implementations are not required to implement lists as structs.
The `Elem` and `Tail` fields are not special and `len` will not work as
expected in these cases.


## Declarations and Scopes


### Blocks

A _block_ is a possibly empty sequence of declarations.
The braces of a struct literal `{ ... }` form a block, but there are
others as well:

- The _universe block_ encompasses all CUE source text.
- Each [package](#modules-instances-and-packages) has a _package block_
  containing all CUE source text in that package.
- Each file has a _file block_ containing all CUE source text in that file.
- Each `for` and `let` clause in a [comprehension](#comprehensions)
  is considered to be its own implicit block.

Blocks nest and influence [scoping](#declarations-and-scope).


### Declarations and scope

A _declaration_ may bind an identifier to a field, alias, or package.
Every identifier in a program must be declared.
Other than for fields,
no identifier may be declared twice within the same block.
For fields an identifier may be declared more than once within the same block,
resulting in a field with a value that is the result of unifying the values
of all fields with the same identifier.
String labels do not bind an identifier to the respective field.

The _scope_ of a declared identifier is the extent of source text in which the
identifier denotes the specified field, alias, or package.

CUE is lexically scoped using blocks:

1. The scope of a [predeclared identifier](#predeclared-identifiers) is the universe block.
1. The scope of an identifier denoting a field
  declared at top level (outside any struct literal) is the package block.
1. The scope of an identifier denoting an alias
  declared at top level (outside any struct literal) is the file block.
1. The scope of the package name of an imported package is the file block of the
  file containing the import declaration.
1. The scope of a field or alias identifier declared inside a struct literal
  is the innermost containing block.

An identifier declared in a block may be redeclared in an inner block.
While the identifier of the inner declaration is in scope, it denotes the entity
declared by the inner declaration.

The package clause is not a declaration;
the package name does not appear in any scope.
Its purpose is to identify the files belonging to the same package
and to specify the default name for import declarations.
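
The rules in this section on repeated field declarations and on redeclaration
in inner blocks can be sketched as follows. The field names are hypothetical;
the comments show what the rules above imply for each reference.

```
x: 1
x: int       // same identifier declared twice: the values unify to 1

a: {
    x: 2     // redeclares x in an inner block
    y: x     // 2, the innermost declaration of x is in scope
}

b: {
    y: x     // 1, resolves to the field x in the enclosing block
}
```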


### Predeclared identifiers

CUE predefines a set of types and builtin functions.
For each of these there is a corresponding keyword which is the name
of the predefined identifier, prefixed with `__`.

```
Functions
len       required  close     open

Types
null      The null type and value
bool      All boolean values
int       All integral numbers
float     All decimal floating-point numbers
string    Any valid UTF-8 sequence
bytes     Any valid byte sequence

Derived   Value
number    int | float
uint      >=0
uint8     >=0 & <=255
int8      >=-128 & <=127
uint16    >=0 & <=65_535
int16     >=-32_768 & <=32_767
rune      >=0 & <=0x10FFFF
uint32    >=0 & <=4_294_967_295
int32     >=-2_147_483_648 & <=2_147_483_647
uint64    >=0 & <=18_446_744_073_709_551_615
int64     >=-9_223_372_036_854_775_808 & <=9_223_372_036_854_775_807
uint128   >=0 & <=340_282_366_920_938_463_463_374_607_431_768_211_455
int128    >=-170_141_183_460_469_231_731_687_303_715_884_105_728 &
           <=170_141_183_460_469_231_731_687_303_715_884_105_727
float32   >=-3.40282346638528859811704183484516925440e+38 &
           <=3.40282346638528859811704183484516925440e+38
float64   >=-1.797693134862315708145274237317043567981e+308 &
           <=1.797693134862315708145274237317043567981e+308
```


### Exported identifiers

An identifier of a package may be exported to permit access to it
from another package.
An identifier is exported if
the first character of the identifier's name is a Unicode upper case letter
(Unicode class "Lu"); and
the identifier is declared in the file block.
All other top-level identifiers used for fields are not exported.

In addition, any definition declared anywhere within a package of which
the first character of the identifier's name is a Unicode upper case letter
(Unicode class "Lu") is visible outside this package.
Any other definition is not visible outside the package and resides
in a namespace separate from that of namesake identifiers of other packages.
This is in contrast to ordinary field declarations that do not begin with
an upper-case letter, which are visible outside the package.

```
package mypackage

foo:   string  // not visible outside mypackage

Foo :: {       // visible outside mypackage
    a:  1      // visible outside mypackage
    B:  2      // visible outside mypackage

    C :: {     // visible outside mypackage
        d: 4   // visible outside mypackage
    }
    e :: foo   // not visible outside mypackage
}
```


### Uniqueness of identifiers

Given a set of identifiers, an identifier is called unique if it is different
from every other in the set, after applying normalization following
Unicode Annex #31.
Two identifiers are different if they are spelled differently
or if they appear in different packages and are not exported.
Otherwise, they are the same.


### Field declarations

A field associates the value of an expression with a label within a struct.
If this label is an identifier, it binds the field to that identifier,
so the field's value can be referenced by writing the identifier.
String labels are not bound to fields.
```
a: {
    b: 2
    "s": 3

    c: b   // 2
    d: s   // _|_ unresolved identifier "s"
    e: a.s // 3
}
```

If an expression may result in a value associated with a default value
as described in [default values](#default-values), the field binds to this
value-default pair.


<!-- TODO: disallow creating identifiers starting with __
...and reserve them for builtin values.

The issue is with code generation. As no guarantee can be given that
a predeclared identifier is not overridden in one of the enclosing scopes,
code will have to handle detecting such cases and renaming them.
An alternative is to have the predeclared identifiers be aliases for namesake
equivalents starting with a double underscore (e.g. string -> __string),
allowing generated code (normal code would keep using `string`) to refer
to these directly.
-->


### Alias declarations

An alias declaration binds an identifier to the given expression.

Within the scope of the identifier, it serves as an _alias_ for that
expression.
The expression is evaluated in the scope in which it was declared.
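
A minimal sketch of an alias declaration, assuming the `X = expr` form
described under Aliases. The names `Base` and `service` are hypothetical.

```
// Base is an alias: it names a value but does not add a field
// to the enclosing struct.
Base = {
    port: 8080
}

service: Base & {
    name: "web"
}
// Only `service` is a field here, with value { port: 8080, name: "web" }.
```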


## Expressions

An expression specifies the computation of a value by applying operators and
built-in functions to operands.

Expressions that require concrete values are called _incomplete_ if any of
their operands are not concrete, but define a value that would be legal for
that expression.
Incomplete expressions may be left unevaluated until a concrete value is
requested at the application level.
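
As an illustration of an incomplete expression, consider the following sketch
(field names are hypothetical). The multiplication cannot be carried out while
`a` is only known to be an `int`, but it is not an error either; it becomes
evaluable once `a` is unified with a concrete value.

```
a: int
b: a * 2    // incomplete: a is not concrete, so b is left unevaluated

c: {
    a: int
    b: a * 2
} & {
    a: 3    // a is now concrete, so c.b evaluates to 6
}
```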

### Operands

Operands denote the elementary values in an expression.
An operand may be a literal, a (possibly qualified) identifier denoting
a field or alias, or a parenthesized expression.

```
Operand     = Literal | OperandName | ListComprehension | "(" Expression ")" .
Literal     = BasicLit | ListLit | StructLit .
BasicLit    = int_lit | float_lit | string_lit |
              null_lit | bool_lit | bottom_lit | top_lit .
OperandName = identifier | QualifiedIdent .
```

### Qualified identifiers

A qualified identifier is an identifier qualified with a package name prefix.

```
QualifiedIdent = PackageName "." identifier .
```

A qualified identifier accesses an identifier in a different package,
which must be [imported].
The identifier must be declared in the [package block] of that package.

```
math.Sin    // denotes the Sin function in package math
```

### References

An identifier operand refers to a field and is called a reference.
The value of a reference is a copy of the expression associated with the field
that it is bound to,
with any references within that expression bound to the respective copies of
the fields they were originally bound to.
Implementations may use a different mechanism to evaluate as long as
these semantics are maintained.

```
a: {
    place:    string
    greeting: "Hello, \(place)!"
}

b: a & { place: "world" }
c: a & { place: "you" }

d: b.greeting  // "Hello, world!"
e: c.greeting  // "Hello, you!"
```



### Primary expressions

Primary expressions are the operands for unary and binary expressions.


```
Slice: indices must be complete
([0, 1, 2, 3] | [2, 3])[0:2] => [0, 1] | [2, 3]

([0, 1, 2, 3] | *[2, 3])[0:2] => [0, 1] | [2, 3]
([0,1,2,3]|[2,3], [2,3])[0:2] => ([0,1]|[2,3], [2,3])

Index
a: (1|2, 1)
b: ([0,1,2,3]|[2,3], [2,3])[a] => ([0,1,2,3]|[2,3][a], 3)

Binary operation
A binary operation is only evaluated if its operands are complete.

Input      Maximum allowed evaluation
a: string  string
b: 2       2
c: a * b   a * 2

A struct is in error if the evaluation of any of its expressions results in
bottom, where an incomplete expression is not considered bottom.
```
<!-- TODO(mpvl)
    Conversion |
-->
```
PrimaryExpr =
    Operand |
    PrimaryExpr Selector |
    PrimaryExpr Index |
    PrimaryExpr Slice |
    PrimaryExpr Arguments .

Selector       = "." (identifier | simple_string_lit) .
Index          = "[" Expression "]" .
Argument       = Expression .
Arguments      = "(" [ ( Argument { "," Argument } ) [ "," ] ] ")" .
```
<!---
TODO:
    PrimaryExpr Query |
Query          = "." Filters .
Filters        = Filter { Filter } .
Filter         = "[" [ "?" ] AliasExpr "]" .

TODO: maybe reintroduce slices, as they are useful in queries, probably this
time with Python semantics.
Slice          = "[" [ Expression ] ":" [ Expression ] [ ":" [Expression] ] "]" .

Argument       = Expression | ( identifier ":" Expression ).

// & expression type
// string_lit: same as label. Arguments is current node.
// If selector is applied to list, it performs the operation for each
// element.

TODO: consider allowing decimal_lit for selectors.
--->

```
x
2
(s + ".txt")
f(3.1415, true)
m["foo"]
s[i : j + 1]
obj.color
f.p[i].x
```


### Selectors

For a [primary expression](#primary-expressions) `x` that is not a [package name](#package-clause),
the selector expression

```
x.f
```

denotes the element of a <!--list or -->struct `x` identified by `f`.
<!--For structs, -->`f` must be an identifier or a string literal identifying
any definition or regular non-optional field.
The identifier `f` is called the field selector.

<!--
Allowing strings to be used as field selectors obviates the need for
backquoted identifiers. Note that some standards use names for structs that
are not standard identifiers (such as "Fn::Foo"). Note that indexing does not
allow access to identifiers.
-->

<!--
For lists, `f` must be an integer and follows the same lookup rules as
for the index operation.
The type of the selector expression is the type of `f`.
-->

If `x` is a package name, see the section on [qualified identifiers](#qualified-identifiers).

<!--
TODO: consider allowing this and also for selectors. It needs to be considered
how defaults are carried forward in cases like:

    x: { a: string | *"foo" } | *{ a: int | *4 }
    y: x.a & string

What is y in this case?
   (x.a & string, _|_)
   (string|"foo", _|_)
   (string|"foo", "foo")
If the latter, then why?

For a disjunction of the form `x1 | ... | xn`,
the selector is applied to each element `x1.f | ... | xn.f`.
-->
1961
Marcel van Lohuizenc7791ac2019-10-07 11:29:28 +02001962Otherwise, if `x` is not a <!--list or -->struct,
1963or if `f` does not exist in `x`,
Marcel van Lohuizendd5e5892018-11-22 23:29:16 +01001964the result of the expression is bottom (an error).
Marcel van Lohuizenfe4abac2019-04-06 17:19:03 +02001965In the latter case the expression is incomplete.
1966The operand of a selector may be associated with a default.
Marcel van Lohuizendd5e5892018-11-22 23:29:16 +01001967
1968```
1969T: {
Marcel van Lohuizenc7791ac2019-10-07 11:29:28 +02001970 x: int
1971 y: 3
1972 "x-y": 4
Marcel van Lohuizendd5e5892018-11-22 23:29:16 +01001973}
1974
Marcel van Lohuizenc7791ac2019-10-07 11:29:28 +02001975a: T.x // int
1976b: T.y // 3
1977c: T.z // _|_ // field 'z' not found in T
1978d: T."x-y" // 4
Marcel van Lohuizenfe4abac2019-04-06 17:19:03 +02001979
1980e: {a: 1|*2} | *{a: 3|*4}
1981f: e.a // 4 (default value)
Marcel van Lohuizendd5e5892018-11-22 23:29:16 +01001982```
1983
Marcel van Lohuizenfe4abac2019-04-06 17:19:03 +02001984<!--
1985```
1986(v, d).f => (v.f, d.f)
1987
1988e: {a: 1|*2} | *{a: 3|*4}
1989f: e.a // 4 after selecting default from (({a: 1|*2} | {a: 3|*4}).a, 4)
1990
1991```
1992-->


### Index expressions

A primary expression of the form

```
a[x]
```

denotes the element of a list or struct `a` indexed by `x`.
The value `x` is called the index or field name, respectively.
The following rules apply:

If `a` is not a struct:

- `a` must be a list (which need not be complete)
- the index `x` unified with `int` must be concrete
- the index `x` is in range if `0 <= x < len(a)`, where only the
  explicitly defined values of an open-ended list are considered,
  otherwise it is out of range

The result of `a[x]` is

for `a` of list type:

- the list element at index `x`, if `x` is within range
- bottom (an error), otherwise


for `a` of struct type:

- the index `x` unified with `string` must be concrete
- the value of the regular and non-optional field named `x` of struct `a`,
  if this field exists
- bottom (an error), otherwise


```
[ 1, 2 ][1]     // 2
[ 1, 2 ][2]     // _|_
[ 1, 2, ...][2] // _|_
```
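
The struct case of the rules above can be sketched in the same way. The struct
`s` and its fields are hypothetical; the results follow from the rules as
stated in this section.

```
s: {
    x:     1
    "x-y": 4
}
k: "x"

a: s["x-y"]  // 4
b: s[k]      // 1
c: s["z"]    // _|_ // s has no field named "z"
```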

Both the operand and index value may be a value-default pair.
```
va[vi]              =>  va[vi]
va[(vi, di)]        =>  (va[vi], va[di])
(va, da)[vi]        =>  (va[vi], da[vi])
(va, da)[(vi, di)]  =>  (va[vi], da[di])
```

```
Fields                  Result
x: [1, 2] | *[3, 4]     ([1,2]|[3,4], [3,4])
i: int | *1             (int, 1)

v: x[i]                 (x[i], 4)
```
Marcel van Lohuizendd5e5892018-11-22 23:29:16 +01002052
Marcel van Lohuizendd5e5892018-11-22 23:29:16 +01002053### Operators
2054
2055Operators combine operands into expressions.
2056
2057```
2058Expression = UnaryExpr | Expression binary_op Expression .
2059UnaryExpr = PrimaryExpr | unary_op UnaryExpr .
2060
Marcel van Lohuizen62b87272019-02-01 10:07:49 +01002061binary_op = "|" | "&" | "||" | "&&" | "==" | rel_op | add_op | mul_op .
Marcel van Lohuizen2b0e7cd2019-03-25 08:28:41 +01002062rel_op = "!=" | "<" | "<=" | ">" | ">=" | "=~" | "!~" .
Marcel van Lohuizendd5e5892018-11-22 23:29:16 +01002063add_op = "+" | "-" .
Marcel van Lohuizen6c35af62019-05-06 10:50:57 +02002064mul_op = "*" | "/" | "div" | "mod" | "quo" | "rem" .
Marcel van Lohuizen7da140a2019-02-01 09:35:00 +01002065unary_op = "+" | "-" | "!" | "*" | rel_op .
Marcel van Lohuizendd5e5892018-11-22 23:29:16 +01002066```

Comparisons are discussed [elsewhere](#Comparison-operators).
For any binary operator, the operand types must unify.

<!-- TODO: durations
 unless the operation involves durations.

Except for duration operations, if one operand is an untyped [literal] and the
other operand is not, the constant is [converted] to the type of the other
operand.
-->

Operands of unary and binary expressions may be associated with a default.
In that case the operator is applied to the values and to the defaults
separately, as in the following example:

<!--
```
O1: op (v1, d1)          => (op v1, op d1)

O2: (v1, d1) op (v2, d2) => (v1 op v2, d1 op d2)
and because v => (v, v)
O3: v1       op (v2, d2) => (v1 op v2, v1 op d2)
O4: (v1, d1) op v2       => (v1 op v2, d1 op v2)
```
-->

```
Field               Resulting Value-Default pair
a: *1|2             (1|2, 1)
b: -a               (-a, -1)

c: a + 2            (a+2, 3)
d: a + a            (a+a, 2)
```
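
How a default propagates through an operator can be modelled outside CUE. The following Python sketch is an illustration only, not part of the language: the helper names `plain`, `lift_unary`, and `lift_binary` are hypothetical, the value part of a pair is kept as an unevaluated expression string, and only the default part is computed.

```python
import operator

def plain(v):
    """A value without a default acts as the pair (v, v)."""
    return (str(v), v)

def lift_unary(symbol, op, pair):
    """Apply a unary operator to the value and the default separately."""
    value, default = pair
    return (f"{symbol}{value}", op(default))

def lift_binary(symbol, op, left, right):
    """Apply a binary operator to the values and the defaults separately."""
    (left_value, left_default), (right_value, right_default) = left, right
    return (f"{left_value}{symbol}{right_value}", op(left_default, right_default))

# Field a: *1|2 is referenced by name; its default is 1.
a = ("a", 1)
b = lift_unary("-", operator.neg, a)             # models b: -a
c = lift_binary("+", operator.add, a, plain(2))  # models c: a + 2
d = lift_binary("+", operator.add, a, a)         # models d: a + a
```

Under these assumptions the computed defaults agree with the table: `-1` for `b`, `3` for `c`, and `2` for `d`.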

#### Operator precedence

Unary operators have the highest precedence.

There are seven precedence levels for binary operators.
Multiplication operators bind strongest, followed by
addition operators, comparison operators,
`&&` (logical AND), `||` (logical OR), `&` (unification),
and finally `|` (disjunction):

```
Precedence    Operator
    7             *  /  div  mod  quo  rem
    6             +  -
    5             ==  !=  <  <=  >  >=  =~  !~
    4             &&
    3             ||
    2             &
    1             |
```

Binary operators of the same precedence associate from left to right.
For instance, `x / y * z` is the same as `(x / y) * z`.

```
+x
23 + 3*x[i]
x <= f()
f() || g()
x == y+1 && y == z-1
2 | int
{ a: 1 } & { b: 2 }
```
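
The precedence table and the left-to-right associativity rule are enough to drive a precedence-climbing parser. The Python sketch below is a hypothetical illustration, not the CUE parser: `tokenize` and `parenthesize` are made-up names, unary operators are not handled, and operands are limited to identifiers, integers, and parenthesized expressions.

```python
import re

# Binary operator precedence, copied from the table above.
PRECEDENCE = {
    "*": 7, "/": 7, "div": 7, "mod": 7, "quo": 7, "rem": 7,
    "+": 6, "-": 6,
    "==": 5, "!=": 5, "<": 5, "<=": 5, ">": 5, ">=": 5, "=~": 5, "!~": 5,
    "&&": 4,
    "||": 3,
    "&": 2,
    "|": 1,
}

# Two-character operators are listed first so they win over their prefixes.
TOKEN = re.compile(r"\s*(&&|\|\||==|!=|<=|>=|=~|!~|[A-Za-z_]\w*|\d+|[-+*/<>&|()])")

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parenthesize(text):
    """Return the expression with every binary operation parenthesized."""
    tokens = tokenize(text)
    pos = 0

    def operand():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            inner = expression(1)
            pos += 1  # skip the closing ")"
            return inner
        return token

    def expression(min_precedence):
        nonlocal pos
        left = operand()
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_precedence:
            op = tokens[pos]
            pos += 1
            # Requiring a strictly higher level on the right makes
            # operators of equal precedence associate to the left.
            right = expression(PRECEDENCE[op] + 1)
            left = f"({left} {op} {right})"
        return left

    return expression(1)
```

For example, `parenthesize("x / y * z")` groups the division first, and `parenthesize("a | b & c")` groups the unification before the disjunction.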

#### Arithmetic operators

Arithmetic operators apply to numeric values and yield a result of the same type
as the first operand, except as noted below for mixed `int` and `float` operands.
Three of the four standard arithmetic operators
(`+`, `-`, `*`) apply to integer and decimal floating-point types;
`+` and `*` also apply to lists, strings, and bytes.
`/` only applies to decimal floating-point types and
`div`, `mod`, `quo`, and `rem` only apply to integer types.

```
+    sum                    integers, floats, lists, strings, bytes
-    difference             integers, floats
*    product                integers, floats, lists, strings, bytes
/    quotient               floats
div  division               integers
mod  modulo                 integers
quo  quotient               integers
rem  remainder              integers
```

For any operator that accepts operands of type `float`, any operand may be
of type `int` or `float`, in which case the result is `float` if any
of the operands is `float`, and `int` otherwise.
For `/` the result is always `float`.
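
The typing rule for numeric operands can be written down as a small decision function. This Python sketch is only a reading aid: `result_type` is a hypothetical name, types are represented by the strings `"int"` and `"float"`, and lists, strings, and bytes are left out.

```python
INTEGER_ONLY = {"div", "mod", "quo", "rem"}

def result_type(op, left, right):
    """Return "int" or "float" for a numeric operation on two operand types."""
    for operand_type in (left, right):
        if operand_type not in ("int", "float"):
            raise TypeError(f"not a numeric type: {operand_type}")
    if op in INTEGER_ONLY:
        if (left, right) != ("int", "int"):
            raise TypeError(f"{op} only applies to integers")
        return "int"
    if op == "/":
        return "float"  # the result of / is always float
    if op in ("+", "-", "*"):
        return "float" if "float" in (left, right) else "int"
    raise ValueError(f"unknown operator: {op}")
```

With this function, `result_type("+", "int", "float")` gives `"float"`, while `result_type("*", "int", "int")` stays `"int"`.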


#### Integer operators

For two integer values `x` and `y`,
the integer quotient `q = x div y` and remainder `r = x mod y`
implement Euclidean division and
satisfy the following relationship:

```
r = x - y*q  with 0 <= r < |y|
```
where `|y|` denotes the absolute value of `y`.

```
 x     y   x div y   x mod y
 5     3      1         2
-5     3     -2         1
 5    -3     -1         2
-5    -3      2         1
```

For two integer values `x` and `y`,
the integer quotient `q = x quo y` and remainder `r = x rem y`
implement truncated division and
satisfy the following relationship:

```
x = q*y + r  and  |r| < |y|
```

with `x quo y` truncated towards zero.

```
 x     y   x quo y   x rem y
 5     3      1         2
-5     3     -1        -2
 5    -3     -1         2
-5    -3      1        -2
```

A zero divisor in either case results in bottom (an error).
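
The two division flavours differ only in how negative operands are handled. The following Python sketch reproduces both tables; the function names `div_mod` and `quo_rem` are hypothetical, and raising `ZeroDivisionError` is merely a stand-in for bottom.

```python
def div_mod(x, y):
    """Euclidean division: x == y*q + r with 0 <= r < |y|."""
    if y == 0:
        raise ZeroDivisionError("a zero divisor results in bottom")
    r = x % abs(y)    # with a positive divisor, Python's % is never negative
    q = (x - r) // y  # exact, because y divides x - r
    return q, r

def quo_rem(x, y):
    """Truncated division: x == q*y + r with q rounded towards zero."""
    if y == 0:
        raise ZeroDivisionError("a zero divisor results in bottom")
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q        # operands of opposite sign give a negative quotient
    r = x - q * y     # the remainder takes the sign of x
    return q, r
```

For instance, `div_mod(-5, 3)` yields `(-2, 1)`, whereas `quo_rem(-5, 3)` yields `(-1, -2)`.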

For integer operands, the unary operators `+` and `-` are defined as follows:

```
+x                          is 0 + x
-x    negation              is 0 - x
```


#### Decimal floating-point operators

For decimal floating-point numbers, `+x` is the same as `x`,
while `-x` is the negation of `x`.
The result of a floating-point division by zero is bottom (an error).

<!-- TODO: consider making it +/- Inf -->

An implementation may combine multiple floating-point operations into a single
fused operation, possibly across statements, and produce a result that differs
from the value obtained by executing and rounding the instructions individually.


#### List operators

Lists can be concatenated using the `+` operator.
Open lists are closed to their default value beforehand.

```
[ 1, 2 ]      + [ 3, 4 ]       // [ 1, 2, 3, 4 ]
[ 1, 2, ... ] + [ 3, 4 ]       // [ 1, 2, 3, 4 ]
[ 1, 2 ]      + [ 3, 4, ... ]  // [ 1, 2, 3, 4 ]
```

Lists can be multiplied by a non-negative `int` using the `*` operator
to create a list that repeats the original the indicated number of times.
```
3*[1,2]         // [1, 2, 1, 2, 1, 2]
3*[1, 2, ...]   // [1, 2, 1, 2, 1, 2]
[byte]*4        // [byte, byte, byte, byte]
0*[1,2]         // []
```
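
The effect of closing open lists before concatenation or repetition can be mimicked with a small model. In this Python sketch, `CueList` and its `is_open` flag are hypothetical names; closing is modelled as simply dropping the trailing `...`, and rejecting a negative count with `ValueError` is an assumption standing in for bottom.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CueList:
    elems: tuple
    is_open: bool = False  # True models a trailing "..."

    def closed(self):
        """Close an open list; in this model that only drops the "..." marker."""
        return CueList(self.elems, False)

    def __add__(self, other):
        # Both operands are closed before they are concatenated.
        return CueList(self.closed().elems + other.closed().elems)

    def __rmul__(self, count):
        if count < 0:
            raise ValueError("repeat count must be non-negative")
        return CueList(self.closed().elems * count)

    __mul__ = __rmul__  # [byte]*4 and 4*[byte] behave the same
```

The result of either operation is always a closed list, regardless of whether the operands were open.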

<!-- TODO(mpvl): should we allow multiplication with a range?
If so, how does one specify a list with a range of possible lengths?

Suggestion from jba:
Multiplication should distribute over disjunction,
so int(1)..int(3) * [x] = [x] | [x, x] | [x, x, x].
The hard part is figuring out what (>=1 & <=3) * [x] means,
since >=1 & <=3 includes many floats.
(mpvl: could constrain arguments to parameter types, but needs to be
done consistently.)
-->


#### String operators

Strings can be concatenated using the `+` operator:
```
s: "hi " + name + " and good bye"
```
String addition creates a new string by concatenating the operands.

A string can be repeated by multiplying it:

```
s: "etc. "*3  // "etc. etc. etc. "
```

<!-- jba: Do these work for byte sequences? If not, why not? -->


#### Comparison operators

Comparison operators compare two operands and yield an untyped boolean value.

```
==    equal
!=    not equal
<     less
<=    less or equal
>     greater
>=    greater or equal
=~    matches regular expression
!~    does not match regular expression
```

<!-- regular expression operator inspired by Bash, Perl, and Ruby. -->

In any comparison, the types of the two operands must unify or one of the
operands must be null.

The equality operators `==` and `!=` apply to operands that are comparable.
The ordering operators `<`, `<=`, `>`, and `>=` apply to operands that are ordered.
The matching operators `=~` and `!~` apply to a string operand and a regular
expression operand.
These terms and the result of the comparisons are defined as follows:

- Null is comparable with itself and any other type.
  Two null values are always equal; null is unequal to anything else.
- Boolean values are comparable.
  Two boolean values are equal if they are either both true or both false.
- Integer values are comparable and ordered, in the usual way.
- Floating-point values are comparable and ordered, as per the definitions
  for binary coded decimals in the IEEE-754-2008 standard.
- Floating-point numbers may be compared with integers.
- String and bytes values are comparable and ordered lexically byte-wise.
- Structs are not comparable.
- Lists are not comparable.
- The regular expression syntax is the one accepted by RE2,
  described in https://github.com/google/re2/wiki/Syntax,
  except for `\C`.
- `s =~ r` is true if `s` matches the regular expression `r`.
- `s !~ r` is true if `s` does not match the regular expression `r`.

<!--- TODO: consider the following
- For regular expression, named capture groups are interpreted as CUE references
  that must unify with the strings matching this capture group.
--->
<!-- TODO: Implementations should adopt an algorithm that runs in linear time? -->
<!-- Consider implementing Level 2 of Unicode regular expression. -->

```
3 < 4       // true
3 < 4.0     // true
null == 2   // false
null != {}  // true
{} == {}    // _|_: structs are not comparable against structs

"Wild cats" =~ "cat"   // true
"Wild cats" !~ "dog"   // true

"foo" =~ "^[a-z]{3}$"  // true
"foo" =~ "^[a-z]{4}$"  // false
```
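
Two of these rules are easy to misread: matching is unanchored unless the pattern anchors itself, and string ordering is by bytes rather than by character. The Python sketch below illustrates both under stated assumptions: `matches` and `less` are hypothetical helpers, Python's `re` module is used as a stand-in for RE2 (the two agree on the simple patterns shown here but not on all syntax), and strings are taken to be UTF-8 encoded.

```python
import re

def matches(s, r):
    """Model of s =~ r: true if r matches anywhere in s."""
    return re.search(r, s) is not None

def less(a, b):
    """Model of a < b for strings: lexical comparison of the UTF-8 bytes."""
    return a.encode("utf-8") < b.encode("utf-8")
```

Because the comparison is byte-wise, `less("Z", "a")` is true: the byte `0x5A` sorts before `0x61`.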

<!-- jba
I think I know what `3 < a` should mean if

    a: >=1 & <=5

It should be a constraint on `a` that can be evaluated once `a`'s value is known more precisely.

But what does `3 < (>=1 & <=5)` mean? We'll never get more information, so it must have a definite value.
-->

#### Logical operators

Logical operators apply to boolean values and yield a result of the same type
as the operands. The right operand is evaluated conditionally.

```
&&    conditional AND    p && q  is  "if p then q else false"
||    conditional OR     p || q  is  "if p then true else q"
!     NOT                !p      is  "not p"
```


<!--
### TODO TODO TODO

3.14 / 0.0   // illegal: division by zero
Illegal conversions always apply to CUE.

Implementation restriction: A compiler may use rounding while computing untyped floating-point or complex constant expressions; see the implementation restriction in the section on constants. This rounding may cause a floating-point constant expression to be invalid in an integer context, even if it would be integral when calculated using infinite precision, and vice versa.
-->

<!--- TODO(mpvl): conversions
### Conversions
Conversions are expressions of the form `T(x)` where `T` and `x` are
expressions.
The result is always an instance of `T`.

```
Conversion = Expression "(" Expression [ "," ] ")" .
```
--->
<!---

A literal value `x` can be converted to type T if `x` is representable by a
value of `T`.

As a special case, an integer literal `x` can be converted to a string type
using the same rule as for non-constant x.

Converting a literal yields a typed value as result.

```
uint(iota)               // iota value of type uint
float32(2.718281828)     // 2.718281828 of type float32
complex128(1)            // 1.0 + 0.0i of type complex128
float32(0.49999999)      // 0.5 of type float32
float64(-1e-1000)        // 0.0 of type float64
string('x')              // "x" of type string
string(0x266c)           // "♬" of type string
MyString("foo" + "bar")  // "foobar" of type MyString
string([]byte{'a'})      // not a constant: []byte{'a'} is not a constant
(*int)(nil)              // not a constant: nil is not a constant, *int is not a boolean, numeric, or string type
int(1.2)                 // illegal: 1.2 cannot be represented as an int
string(65.0)             // illegal: 65.0 is not an integer constant
```
--->
<!---

A conversion is always allowed if `x` is an instance of `T`.

If `T` and `x` are of different underlying type, a conversion is allowed if
`x` can be converted to a value `x'` of `T`'s type, and
`x'` is an instance of `T`.
A value `x` can be converted to the type of `T` in any of these cases:

- `x` is a struct and is subsumed by `T`.
- `x` and `T` are both integer or floating points.
- `x` is an integer or a byte sequence and `T` is a string.
- `x` is a string and `T` is a byte sequence.

Specific rules apply to conversions between numeric types, structs,
or to and from a string type. These conversions may change the representation
of `x`.
All other conversions only change the type but not the representation of x.


#### Conversions between numeric ranges
For the conversion of numeric values, the following rules apply:

1. Any integer value can be converted into any other integer value
   provided that it is within range.
2. When converting a decimal floating-point number to an integer, the fraction
   is discarded (truncation towards zero). TODO: or disallow truncating?

```
a: uint16(int(1000))  // uint16(1000)
b: uint8(1000)        // _|_ // overflow
c: int(2.5)           // 2  TODO: TBD
```


#### Conversions to and from a string type

Converting a list of bytes to a string type yields a string whose successive
bytes are the elements of the slice.
Invalid UTF-8 is converted to `"\uFFFD"`.

```
string('hell\xc3\xb8')   // "hellø"
string(bytes([0x20]))    // " "
```

A string value is always convertible to a list of bytes.

```
bytes("hellø")   // 'hell\xc3\xb8'
bytes("")        // ''
```

#### Conversions between list types

Conversions between list types are possible only if `T` strictly subsumes `x`
and the result will be the unification of `T` and `x`.

If we introduce named types this would be different from IP & [10, ...]

Consider removing this until it has a different meaning.

```
IP:        4*[byte]
Private10: IP([10, ...])  // [10, byte, byte, byte]
```

#### Conversions between struct types

A conversion from `x` to `T`
is applied using the following rules:

1. `x` must be an instance of `T`,
2. all fields defined for `x` that are not defined for `T` are removed from
   the result of the conversion, recursively.

<!-- jba: I don't think you say anywhere that the matching fields are unified.
mpvl: they are not, x must be an instance of T, in which case x == T&x,
so unification would be unnecessary.
-->
<!--
```
T: {
    a: { b: 1..10 }
}

x1: {
    a: { b: 8, c: 10 }
    d: 9
}

c1: T(x1)             // { a: { b: 8 } }
c2: T({})             // _|_  // missing field 'a' in '{}'
c3: T({ a: {b: 0} })  // _|_  // field a.b does not unify (0 & 1..10)
```
-->

### Calls

Calls can be made to core library functions, called builtins.
Given an expression `f` of function type `F`,
```
f(a1, a2, … an)
```
calls `f` with arguments `a1, a2, … an`. Arguments must be expressions
whose values are instances of the parameter types of `F`;
they are evaluated before the function is called.

```
a: math.Atan2(x, y)
```

In a function call, the function value and arguments are evaluated in the usual
order.
After they are evaluated, the parameters of the call are passed by value
to the function and the called function begins execution.
The return parameters
of the function are passed by value back to the calling function when the
function returns.


### Comprehensions

Lists and fields can be constructed using comprehensions.

Each defines a clause sequence that consists of `for`, `if`, and
`let` clauses, nesting from left to right.
The `for` and `let` clauses each define a new scope in which new values are
bound to be available for the next clause.

The `for` clause binds the defined identifiers, on each iteration, to the next
value of some iterable value in a new scope.
A `for` clause may bind one or two identifiers.
If there is one identifier, it is bound to the value of
a list element or struct field.
If there are two identifiers, the first value will be the key or index,
if available, and the second will be the value.

For lists, `for` iterates over all elements in the list after closing it.
For structs, `for` iterates over all non-optional regular fields.

An `if` clause, or guard, specifies an expression that terminates the current
iteration if it evaluates to false.

The `let` clause binds the result of an expression to the defined identifier
in a new scope.

An iteration is said to complete if the innermost block of the clause
sequence is reached.

_List comprehensions_ specify a single expression that is evaluated and included
in the list for each completed iteration.

_Field comprehensions_ follow a clause sequence with a struct literal,
where the struct literal is evaluated and embedded at the point of
declaration of the comprehension for each completed iteration.
As usual, fields in the struct may evaluate to the same label,
resulting in the unification of their values.

```
Comprehension       = Clauses StructLit .
ListComprehension   = "[" Expression Clauses "]" .

Clauses             = Clause { Clause } .
Clause              = ForClause | GuardClause | LetClause .
ForClause           = "for" identifier [ "," identifier ] "in" Expression .
GuardClause         = "if" Expression .
LetClause           = "let" identifier "=" Expression .
```

```
a: [1, 2, 3, 4]
b: [ x+1 for x in a if x > 1]  // [3, 4, 5]

c: {
    for x in a
    if x < 4
    let y = 1 {
        "\(x)": x + y
    }
}
d: { "1": 2, "2": 3, "3": 4 }
```
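
For readers more familiar with imperative languages, the clause sequence of the example can be unrolled by hand. The Python sketch below is an analogy, not a definition: a plain `dict` assignment stands in for embedding the struct literal, which is only equivalent here because no label is produced twice (CUE would unify repeated labels, whereas Python would overwrite them).

```python
a = [1, 2, 3, 4]

# b: [ x+1 for x in a if x > 1]
b = [x + 1 for x in a if x > 1]

# c: { for x in a if x < 4 let y = 1 { "\(x)": x + y } }
c = {}
for x in a:            # for clause: binds x on each iteration
    if not x < 4:      # guard: a false condition ends this iteration
        continue
    y = 1              # let clause: binds y for the remaining clauses
    c[str(x)] = x + y  # struct literal, evaluated once per completed iteration
```

After running this, `b` is `[3, 4, 5]` and `c` has the same content as field `d` in the example.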


### String interpolation

String interpolation allows constructing strings by replacing placeholder
expressions with their string representation.
String interpolation may be used in single- and double-quoted strings, as well
as their multiline equivalent.

A placeholder consists of "\(" followed by an expression and a ")". The
expression is evaluated within the scope within which the string is defined.

```
a: "World"
b: "Hello \( a )!" // Hello World!
```


## Builtin Functions

Built-in functions are predeclared. They are called like any other function.


### `len`

The built-in function `len` takes arguments of various types and returns
a result of type `int`.

```
Argument type    Result

string            string length in bytes
bytes             length of byte sequence
list              list length, smallest length for an open list
struct            number of distinct data fields, including optional
```
<!-- TODO: consider not supporting len, but instead rely on more
precisely named builtin functions:
  - strings.RuneLen(x)
  - bytes.Len(x)  // x may be a string
  - struct.NumFooFields(x)
  - list.Len(x)
-->

```
Expression           Result
len("Hellø")         6
len([1, 2, 3])       3
len([1, 2, ...])     >=2
```
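
The result `6` for a five-character string follows from counting bytes rather than characters. A minimal Python sketch makes the difference visible; `len_string` is a hypothetical helper and assumes the string is encoded as UTF-8.

```python
def len_string(s):
    """Length of a string in bytes, assuming UTF-8 encoding."""
    return len(s.encode("utf-8"))

byte_length = len_string("Hellø")  # "ø" occupies two bytes in UTF-8
char_length = len("Hellø")         # Python's own len counts code points
```

Here `byte_length` is 6 while `char_length` is 5.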


### `close`

The built-in function `close` converts a partially defined, or open, struct
to a fully defined, or closed, struct.


### `and`

The built-in function `and` takes a list and returns the result of applying
the `&` operator to all elements in the list.
It returns top for the empty list.

```
Expression:       Result
and([a, b])       a & b
and([a])          a
and([])           _
```

### `or`

The built-in function `or` takes a list and returns the result of applying
the `|` operator to all elements in the list.
It returns bottom for the empty list.

```
Expression:       Result
or([a, b])        a | b
or([a])           a
or([])            _|_
```
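
The results for the empty list are the identity elements of the two operators: unifying with top and disjoining with bottom both leave a value unchanged. The Python sketch below shows this with a deliberately simplified model in which a value is the set of concrete values it admits, `&` is set intersection, and `|` is set union. The names `and_`, `or_`, `TOP`, and `BOTTOM` are hypothetical, and the finite `TOP` is an assumption made only to keep the model executable.

```python
from functools import reduce

TOP = frozenset(range(10))  # stand-in for top (_): admits everything in this toy universe
BOTTOM = frozenset()        # stand-in for bottom (_|_): admits nothing

def and_(values):
    """Fold & over the list, starting from top."""
    return reduce(lambda x, y: x & y, values, TOP)

def or_(values):
    """Fold | over the list, starting from bottom."""
    return reduce(lambda x, y: x | y, values, BOTTOM)
```

Starting each fold from the identity element is what makes `and_([])` equal to `TOP` and `or_([])` equal to `BOTTOM` without a special case.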


## Cycles

Implementations are required to interpret or reject cycles encountered
during evaluation according to the rules in this section.


### Reference cycles

A _reference cycle_ occurs if a field references itself, either directly or
indirectly.

```
// x references itself
x: x

// indirect cycles
b: c
c: d
d: b
```

Implementations should report these as an error except in the following cases:


#### Expressions that unify an atom with an expression

An expression of the form `a & e`, where `a` is an atom
and `e` is an expression, always evaluates to `a` or bottom.
As it does not matter how we fail, we can assume the result to be `a`
and, after the field in which the expression occurs has been evaluated,
validate that `a == e`.

```
// Config            Evaluates to (requiring concrete values)
x: {                  x: {
    a: b + 100            a: _|_ // cycle detected
    b: a - 100            b: _|_ // cycle detected
}                     }

y: x & {              y: {
    a: 200                a: 200 // asserted that 200 == b + 100
                          b: 100
}                     }
```
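
The "assume, then validate" order of evaluation for `y` can be spelled out step by step. The Python sketch below hard-codes that one example; `evaluate_y` is a hypothetical name, and raising `ValueError` is an assumed stand-in for bottom.

```python
def evaluate_y():
    """Evaluate y: x & { a: 200 } where x is { a: b + 100, b: a - 100 }."""
    a = 200           # a: 200 & (b + 100); assume the atom is the result
    b = a - 100       # b can now be computed from the assumed a
    if a != b + 100:  # deferred validation that the atom equals the expression
        raise ValueError("bottom: 200 != b + 100")
    return {"a": a, "b": b}
```

Calling `evaluate_y()` returns the concrete struct with `a` set to 200 and `b` set to 100, matching the right-hand column of the example.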


#### Field values

A field value of the form `r & v`,
where `r` evaluates to a reference cycle and `v` is a value,
evaluates to `v`.
Unification is idempotent and unifying a value with itself ad infinitum,
which is what the cycle represents, results in this value.
Implementations should detect cycles of this kind, ignore `r`,
and take `v` as the result of unification.

<!-- Tomabechi's graph unification algorithm
can detect such cycles at near-zero cost. -->

```
Configuration    Evaluated
//    c           Cycles in nodes of type struct evaluate
//  ↙︎   ↖         to the fixed point of unifying their
// a  →  b        values ad infinitum.

a: b & { x: 1 }   // a: { x: 1, y: 2, z: 3 }
b: c & { y: 2 }   // b: { x: 1, y: 2, z: 3 }
c: a & { z: 3 }   // c: { x: 1, y: 2, z: 3 }

// resolve a             b & {x:1}
// substitute b          c & {y:2} & {x:1}
// substitute c          a & {z:3} & {y:2} & {x:1}
// eliminate a (cycle)   {z:3} & {y:2} & {x:1}
// simplify              {x:1,y:2,z:3}
```
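
The substitution steps in the comments can be turned into a small recursive procedure. The following Python sketch is a simplified model, not the algorithm an implementation must use: `unify` and `resolve` are hypothetical names, every field is assumed to have the shape `reference & struct literal`, structs are flat, and a reference that closes a cycle is replaced by the empty struct, which has the same effect as ignoring it.

```python
def unify(x, y):
    """Unify two flat structs; conflicting concrete values are an error."""
    result = dict(x)
    for label, value in y.items():
        if label in result and result[label] != value:
            raise ValueError(f"conflicting values for {label}")
        result[label] = value
    return result

def resolve(name, fields, visiting=()):
    """Evaluate field `name`, ignoring references that close a cycle."""
    if name in visiting:
        return {}  # reference cycle detected: the reference contributes nothing
    reference, literal = fields[name]
    return unify(resolve(reference, fields, visiting + (name,)), literal)

# a: b & { x: 1 }, b: c & { y: 2 }, c: a & { z: 3 }
fields = {
    "a": ("b", {"x": 1}),
    "b": ("c", {"y": 2}),
    "c": ("a", {"z": 3}),
}
```

Resolving any of `a`, `b`, or `c` with this model produces the same struct, as the evaluated column of the example shows.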

This rule also applies to field values that are disjunctions of unification
operations of the above form.

```
a: b&{x:1} | {y:1}  // {x:1,y:3,z:2} | {y:1}
b: {x:2} | c&{z:2}  // {x:2} | {x:1,y:3,z:2}
c: a&{y:3} | {z:3}  // {x:1,y:3,z:2} | {z:3}


// resolving a           b&{x:1} | {y:1}
// substitute b          ({x:2} | c&{z:2})&{x:1} | {y:1}
// simplify              c&{z:2}&{x:1} | {y:1}
// substitute c          (a&{y:3} | {z:3})&{z:2}&{x:1} | {y:1}
// simplify              a&{y:3}&{z:2}&{x:1} | {y:1}
// eliminate a (cycle)   {y:3}&{z:2}&{x:1} | {y:1}
// expand                {x:1,y:3,z:2} | {y:1}
```

Note that all nodes that form a reference cycle and evaluate to a struct will
evaluate to the same value.
If a field value is a disjunction, any element that is part of a cycle will
evaluate to this value.


### Structural cycles

CUE disallows infinite structures.
Implementations must report an error when encountering such declarations.

<!-- for instance using an occurs check -->

```
// Disallowed: a list of infinite length with all elements being 1.
list: {
    head: 1
    tail: list
}

// Disallowed: another infinite structure (a:{b:{d:{b:{d:{...}}}}}, ...).
a: {
    b: c
}
c: {
    d: a
}
```
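
One way to detect such declarations is an occurs check: while expanding a field, remember which fields are currently being expanded and report a cycle when one of them is reached again through a nested field. The Python sketch below is a simplified, hypothetical model (the `Ref` class and both function names are made up); it only handles references nested inside struct fields and does not model disjunctions or defaults.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    name: str  # reference to a top-level field

def has_structural_cycle(name, config, path=()):
    """Report whether expanding field `name` would never terminate."""
    if name in path:
        return True  # the field occurs inside its own expansion
    return _occurs(config[name], config, path + (name,))

def _occurs(value, config, path):
    if isinstance(value, Ref):
        return has_structural_cycle(value.name, config, path)
    if isinstance(value, dict):
        return any(_occurs(v, config, path) for v in value.values())
    return False  # atoms contain no references

config = {
    "list": {"head": 1, "tail": Ref("list")},
    "a": {"b": Ref("c")},
    "c": {"d": Ref("a")},
    "ok": {"x": 1, "y": Ref("leaf")},
    "leaf": {"z": 2},
}
```

In this model `list`, `a`, and `c` are all reported as structural cycles, while `ok` is accepted because its expansion ends at `leaf`.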

It is allowed for a value to define an infinite set of possibilities
without evaluating to an infinite structure itself.

```
// List defines a list of arbitrary length (default null).
List: *null | {
    head: _
    tail: List
}
```

<!--
Consider banning any construct that makes CUE not having a linear
running time expressed in the number of nodes in the output.

This would require restricting constructs like:

(fib&{n:2}).out

fib: {
        n: int

        out: (fib&{n:n-2}).out + (fib&{n:n-1}).out if n >= 2
        out: fib({n:n-2}).out + fib({n:n-1}).out if n >= 2
        out: n if n < 2
}

-->
<!--
### Unused fields

TODO: rules for detection of unused fields

1. Any alias value must be used
-->


## Modules, instances, and packages

CUE configurations are constructed by combining _instances_.
An instance, in turn, is constructed from one or more source files belonging
to the same _package_ that together declare the data representation.
Elements of this data representation may be exported and used
in other instances.

### Source file organization

Each source file consists of an optional package clause defining the collection
of files to which it belongs,
followed by a possibly empty set of import declarations that declare
packages whose contents it wishes to use, followed by a possibly empty set of
declarations.

As with a struct, a source file may contain embeddings.
Unlike with a struct, the embedded expressions may be any value.
If the result of the unification of all embedded values is not a struct,
it will be output instead of its enclosing file when exporting CUE
to a data format.

```
SourceFile = [ PackageClause "," ] { ImportDecl "," } { Declaration "," } .
```

```
"Hello \(place)!"

place: "world"

// Outputs "Hello world!"
```
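The following sketch shows a source file that uses all three parts of the
grammar in order.
It borrows the package "lib/math" and its function Sin from the import example
later in this section; the package name `trig` and the field names are
hypothetical and serve only to illustrate the layout.

```
// Package clause: this file belongs to the (hypothetical) package trig.
package trig

// Import declaration: gives access to exported identifiers of lib/math.
import "lib/math"

// Declarations.
angle: 0.5
sine:  math.Sin(angle)
```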

### Package clause

A package clause is an optional clause that defines the package to which
a source file belongs.

```
PackageClause = "package" PackageName .
PackageName   = identifier .
```

The PackageName must not be the blank identifier.

```
package math
```

### Modules and instances

A _module_ defines a tree of directories, rooted at the _module root_.

All source files within a module that declare the same package name belong to
the same package.
<!-- jba: I can't make sense of the above sentence. -->
A module may define multiple packages.

An _instance_ of a package is any subset of files belonging
to the same package.
<!-- jba: Are you saying that -->
<!-- if I have a package with files a, b and c, then there are 8 instances of -->
<!-- that package, some of which are {a, b}, {c}, {b, c}, and so on? What's the -->
<!-- purpose of that definition? -->
It is interpreted as the concatenation of these files.

An implementation may impose conventions on the layout of package files
to determine which files of a package belong to an instance.
For example, an instance may be defined as the subset of package files
belonging to a directory and all its ancestors.
<!-- jba: OK, that helps a little, but I still don't see what the purpose is. -->
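As an illustration of the directory-based convention mentioned above, consider
the following hypothetical module layout.
The directory and file names are invented for this sketch, and all three files
are assumed to declare `package deploy`.

```
root/               // module root
    base.cue        // package deploy
    prod/
        prod.cue    // package deploy
        eu/
            eu.cue  // package deploy
```

Under that convention, the instance for directory `root/prod/eu` would consist
of `eu.cue`, `prod.cue`, and `base.cue`, whereas the instance for `root/prod`
would consist of only `prod.cue` and `base.cue`.
Other implementations may choose a different convention.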


### Import declarations

An import declaration states that the source file containing the declaration
depends on definitions of the _imported_ package (§Program initialization and
execution) and enables access to exported identifiers of that package.
The import names an identifier (PackageName) to be used for access and an
ImportPath that specifies the package to be imported.

```
ImportDecl     = "import" ( ImportSpec | "(" { ImportSpec "," } ")" ) .
ImportSpec     = [ PackageName ] ImportPath .
ImportLocation = { unicode_value } .
ImportPath     = `"` ImportLocation [ ":" identifier ] `"` .
```

The PackageName is used in qualified identifiers to access
exported identifiers of the package within the importing source file.
It is declared in the file block.
It defaults to the identifier specified in the package clause of the imported
package, which must match either the last path component of ImportLocation
or the identifier following it.

<!--
Note: this deviates from the Go spec where there is no such restriction.
This restriction has the benefit of being able to determine the identifiers
for packages from within the file itself. But for CUE it has another benefit:
when using package hierarchies, one is more likely to want to include multiple
packages within the same directory structure. This mechanism allows
disambiguation in these cases.
-->
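The fragment below sketches the grouped form of an import declaration together
with the three ways in which a PackageName can be determined.
The import paths are hypothetical, and the comments assume that each imported
package declares the package name shown.

```
import (
    // PackageName defaults to math, the last path component.
    "lib/math"

    // PackageName defaults to shapes, the identifier following the colon.
    "lib/geometry:shapes"

    // PackageName is given explicitly as t.
    t "lib/text"
)
```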

The interpretation of the ImportPath is implementation-dependent but it is
typically either the path of a builtin package or a fully qualified location
of a package within a source code repository.

An ImportLocation must be a non-empty string using only characters belonging to
Unicode's L, M, N, P, and S general categories
(the Graphic characters without spaces)
and may not include the characters !"#$%&'()*,:;<=>?[\]^`{|}
or the Unicode replacement character U+FFFD.
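To make these restrictions concrete, the following hypothetical locations are
classified according to the rule above.
None of them refers to an actual package.

```
"lib/math"              // valid
"example.com/pkg-v2"    // valid: '.', '/', and '-' are permitted punctuation
"lib/my math"           // invalid: contains a space
"lib/math?"             // invalid: '?' is an excluded character
""                      // invalid: empty
```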

Assume we have a package containing the package clause "package math",
which exports function Sin, at the path identified by "lib/math".
This table illustrates how Sin is accessed in files
that import the package after the various types of import declaration.

```
Import declaration          Local name of Sin

import   "lib/math"         math.Sin
import   "lib/math:math"    math.Sin
import m "lib/math"         m.Sin
```

An import declaration declares a dependency relation between the importing and
imported package. It is illegal for a package to import itself, directly or
indirectly, or to directly import a package without referring to any of its
exported identifiers.
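A short sketch of the second restriction, reusing the package "lib/math" from
the table above.
The file names and the package name `main` are hypothetical.
The first file is illegal because it never refers to an exported identifier of
the imported package:

```
// bad.cue: illegal, math is imported but never used.
package main

import "lib/math"

x: 1
```

The second file is legal because it refers to Sin:

```
// good.cue: legal, math.Sin is referenced.
package main

import "lib/math"

x: math.Sin(1)
```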


### An example package

TODO