<p>A language definition is basically just a JSON value describing various properties of your language. Recognized attributes are:</p>
<dl>
<dt>ignoreCase</dt><dd>(optional=<code>false</code>, boolean) Is the language case insensitive? The regular expressions in the tokenizer use this to do case (in)sensitive matching, as well
as tests in the <code>cases</code> construct.</dd>
<dt>defaultToken</dt><dd>(optional=<code>"source"</code>, string) The default token returned if nothing matches in the tokenizer. It can be convenient to set this to <code>"invalid"</code> during development of your colorizer to easily spot what is not matched yet.</dd>
<dt id="brackets">brackets</dt><dd>(optional, array of bracket definitions) This is used by the tokenizer to easily define matching braces. See <a href="#@brackets"><code class="dt">@brackets</code></a> and <a href="#bracket"><code class="dt">bracket</code></a> for more information. Each bracket definition is an array of 3 elements, or an object, describing the <code>open</code> brace, the <code>close</code> brace, and the <code>token</code> class. The default definition is:
<pre class="highlight">
[ ['{','}','delimiter.curly'],
['[',']','delimiter.square'],
['(',')','delimiter.parenthesis'],
['<','>','delimiter.angle'] ]</pre>
</dd>
<dt>tokenizer</dt><dd>(required, object with states) This defines the tokenization rules – see the next section for a detailed description.</dd>
</dl>
<p>More attributes can be specified; they are described in the <a href="#moreattr">advanced attributes</a> section later in this document.</p>
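<p>As a rough sketch of how the attributes above fit together, a small definition might look like the following. The variable name <code>myLanguage</code>, the state name <code>root</code>, and the two tokenizer rules are assumptions made for illustration only; they are not taken from this document.</p>

```javascript
// Hypothetical language definition; all names and patterns are illustrative.
const myLanguage = {
  ignoreCase: true,        // regular expressions match case-insensitively
  defaultToken: 'invalid', // makes unmatched input easy to spot during development

  // A bracket definition may be a 3-element array or an object
  // with `open`, `close`, and `token` properties.
  brackets: [
    ['{', '}', 'delimiter.curly'],
    { open: '(', close: ')', token: 'delimiter.parenthesis' }
  ],

  // The only required attribute: an object whose properties are states.
  tokenizer: {
    root: [
      [/[a-z_]\w*/, { token: 'identifier' }],
      [/\d+/, { token: 'number' }]
    ]
  }
};
```

<p>Setting <code>defaultToken</code> to <code>'invalid'</code> here follows the development tip given above; a finished definition would normally leave it at its default.</p>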
<h2>Creating a tokenizer</h2>
<p>The <code>tokenizer</code> attribute describes how lexical analysis takes place, and how the input is divided into tokens. Each token is given a CSS class name which is used to render each token in the editor. Standard CSS token classes include:</p>
<pre class="highlight">
identifier entity constructor
operators tag namespace
keyword info-token type
meta      .[content]</pre>
<dt id="token">{ token: <em>tokenclass</em> }</dt><dd>An object that defines the token class used with CSS rendering. Common token classes are, for example, <code>'keyword'</code>, <code>'comment'</code> or <code>'identifier'</code>. You can use a dot to use hierarchical CSS names, like <code>'type.identifier'</code> or <code>'string.escape'</code>. You can also include <code>$</code> patterns that are substituted with a captured group from the matched input or the tokenizer state. The patterns are described in the <a href="#pattern">guard section</a> of this document.
There are some special token classes:
<dl>
<dt id="@brackets">"@brackets"</dt><dd>or</dd>
<dt>"@brackets.<em>tokenclass</em>"</dt><dd>Signifies that brackets were tokenized. The token class for CSS is determined by the token class defined in the <a href="#brackets"><code>brackets</code></a> attribute (together with <code><em>tokenclass</em></code> if present). Moreover, the <a href="#bracket"><code class="dt">bracket</code></a> attribute is set such that the editor matches the braces (and does auto indentation). For example:
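<p>One possible sketch, assuming a definition with two bracket pairs; the variable name <code>bracketExample</code>, the state name <code>root</code>, and the pattern are illustrative assumptions:</p>

```javascript
// Hypothetical fragment of a language definition.
const bracketExample = {
  brackets: [
    ['{', '}', 'delimiter.curly'],
    ['(', ')', 'delimiter.parenthesis']
  ],
  tokenizer: {
    root: [
      // Any of the four bracket characters is tokenized as a bracket.
      // The CSS token class is then taken from the matching entry in
      // `brackets`, so no class needs to be spelled out in the rule.
      [/[{}()]/, { token: '@brackets' }]
    ]
  }
};
```

<p>With this rule, a matched <code>{</code> would be expected to receive the class <code>delimiter.curly</code> and be treated as an open brace.</p>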
<dt id="bracket">bracket: <em>kind</em></dt><dd><span class="adv">(Advanced)</span> The <code><em>kind</em></code> can be either <code>'@open'</code> or <code>'@close'</code>. This signifies that a token is either an open or close brace. This attribute is set automatically if the token class is <a href="#@brackets"><code class="dt">@brackets</code></a>.
The editor uses the bracket information to show matching braces (where an open bracket matches with a close bracket if their token classes are the same). Moreover, when a user opens a new line the editor will do auto indentation on open braces. Normally, this attribute does not need to be set if you are using the <a href="#brackets"><code class="dt">brackets</code></a> attribute and it is only used for complex brace matching. This is discussed further in the next section on <a href="#complexmatch">advanced brace matching</a>.</dd>
<dt id="nextEmbedded">nextEmbedded: <em>langId</em> <span>or</span> '@pop'</dt><dd><span class="adv">(Advanced)</span> Signifies to the editor that this token is followed by code in another language specified by the <code><em>langId</em></code>, for example <code>javascript</code>. Internally, our syntax highlighter keeps tokenizing the source until it finds an ending sequence. At that point, you can use <code class="dt">nextEmbedded</code> with a <code class="dt">'@pop'</code> value to pop out of the embedded mode again. Usually, we also need a <code class="dt">next</code> attribute to switch to a state where we can tokenize the foreign code. As an example, here is how we could support CSS fragments in our language:
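<p>A minimal sketch under stated assumptions: the state names <code>root</code> and <code>css_block</code>, the <code>&lt;style&gt;</code> delimiters, and the language id <code>'css'</code> are all hypothetical and depend on what the host editor has registered.</p>

```javascript
// Hypothetical fragment; state names, patterns and the 'css' id are assumptions.
const embeddedExample = {
  tokenizer: {
    root: [
      // An opening <style> tag switches to the css_block state and tells
      // the editor that the following text is written in another language.
      [/<style\s*>/, { token: 'keyword', next: '@css_block', nextEmbedded: 'css' }]
    ],
    css_block: [
      // The closing tag is the ending sequence: it leaves the embedded
      // mode and pops back to the previous state.
      [/<\/style\s*>/, { token: 'keyword', next: '@pop', nextEmbedded: '@pop' }]
    ]
  }
};
```

<p>The <code>css_block</code> state only needs the rule for the ending sequence, since everything before it is handed to the embedded language.</p>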
<dt id="log">log: <em>message</em></dt><dd>Used for debugging. Logs <code><em>message</em></code> to the console window in the browser (press F12 to see it). This can be useful to see if a certain action is executing. For example:
<pre class="highlight">[/\d+/, { token: 'number', log: 'found number $0 in state $S0' } ]</pre>
</dd>
<dd> </dd>
<!--
<dt>bracketType: <em>bracketType</em></dt><dd>If <code>token</code> is <code>"@brackets"</code>, this attribute can specify for an arbitrary matched input (like <code>"end"</code>), which is not present in the <code>brackets</code> attribute, what kind of bracket this is: <code>"@open"</code>, <code>"@close"</code>, or <code>"@none"</code>.</dd>