Identifiers and patterns
In Nanoc, every item (page or asset) and every layout has a unique identifier: a string derived from the file’s path. A pattern is an expression that is used to select items or layouts based on their identifier.
Identifiers
Identifiers come in two types: the full type, new in Nanoc 4, and the legacy type, used in Nanoc 3.
- full
- An identifier with the full type is the filename, with the path to the content directory removed. For example, the file /Users/denis/stoneship/content/about.md will have the full identifier /about.md.
- legacy
- An identifier with the legacy type is the filename, with the path to the content directory removed, the extension removed, and a slash appended. For example, the file /Users/denis/stoneship/content/about.md will have the legacy identifier /about/. This corresponds closely with paths in clean URLs.
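The two definitions above can be sketched in plain Ruby. The helpers `full_identifier` and `legacy_identifier` are hypothetical names used only for illustration and are not part of Nanoc. The sketch assumes a file with a single extension, as in the example above.

```ruby
# Hypothetical helpers, not part of Nanoc: they only mirror the two
# definitions above for a file with a single extension.

# Full identifier: the filename with the content directory path removed.
def full_identifier(filename, content_dir)
  filename.delete_prefix(content_dir)
end

# Legacy identifier: additionally drop the extension and append a slash.
def legacy_identifier(filename, content_dir)
  relative = full_identifier(filename, content_dir)
  relative.delete_suffix(File.extname(relative)) + '/'
end

content_dir = '/Users/denis/stoneship/content'
filename    = '/Users/denis/stoneship/content/about.md'

puts full_identifier(filename, content_dir)   # prints "/about.md"
puts legacy_identifier(filename, content_dir) # prints "/about/"
```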
The following methods are useful for full identifiers:
-
identifier.ext
→ String -
The last extension of this identifier. For example:
Nanoc::Identifier.new('/about.md').ext # => "md"
Nanoc::Identifier.new('/about.html.erb').ext # => "erb"
-
identifier.exts
→ Array of Strings -
All extensions of this identifier. For example:
Nanoc::Identifier.new('/about.html.erb').exts # => ["html", "erb"]
-
identifier.components
→ Array of Strings -
Identifier split by slash. For example:
Nanoc::Identifier.new('/software/nanoc.md').components # => ["software", "nanoc.md"]
-
identifier.match?(pattern)
→ true or false -
True if the identifier matches the pattern (either a String or a Regexp), false otherwise. For example:
Nanoc::Identifier.new('/software/nanoc.md').match?('/software/*') # => true
Nanoc::Identifier.new('/software/nanoc.md').match?('/soft*') # => false
-
identifier.without_ext
→ String -
Identifier with the last extension removed. For example:
Nanoc::Identifier.new('/software/nanoc.md').without_ext # => "/software/nanoc"
Nanoc::Identifier.new('/about.html.erb').without_ext # => "/about.html"
-
identifier.without_exts
→ String -
Identifier with all extensions removed. For example:
Nanoc::Identifier.new('/about.html.erb').without_exts # => "/about"
-
identifier + string
→ String -
Identifier with the given string appended. For example:
Nanoc::Identifier.new('/software') + '/nanoc' # => "/software/nanoc"
-
identifier =~ pat
-
Truthy if the identifier matches the pattern (either a String or a Regexp), falsy otherwise. For example:
Nanoc::Identifier.new('/software/nanoc.md') =~ '/software/*' # => 0
The following method is useful for legacy identifiers:
-
identifier.chop
→ String -
Identifier with the last character removed. For example:
identifier = Nanoc::Identifier.new('/about/', type: :legacy)
identifier.to_s # => "/about/"
identifier.chop # => "/about"
identifier.chop + '.html' # => "/about.html"
identifier + 'index.html' # => "/about/index.html"
Patterns
Patterns are used to find items and layouts based on their identifier. They come in three varieties:
- glob patterns
- regular expression patterns
- legacy patterns
Glob patterns
Glob patterns are strings that contain wildcard characters. Wildcard characters are characters that can be substituted for other characters in an identifier. An example of a glob pattern is /projects/*.md, which matches all files with the md extension in the /projects directory.
Globs are commonplace in Unix-like environments. For example, the Unix command for listing all files with the md extension in the current directory is ls *.md. In this example, the argument to the ls command is a glob pattern that contains the wildcard *.
Nanoc supports the following wildcards in glob patterns:
*
- Matches any file or directory name. Does not cross directory boundaries. For example, /projects/*.md matches /projects/nanoc.md, but neither /projects/cri.adoc nor /projects/nanoc/about.md.
**/
- Matches zero or more levels of nested directories. For example, /projects/**/*.md matches both /projects/nanoc.md and /projects/nanoc/history.md.
?
- Matches a single character.
[abc]
- Matches any single character in the set. For example, /people/[kt]im.md matches only /people/kim.md and /people/tim.md.
{foo,bar}
- Matches either string in the comma-separated list. More than two strings are possible. For example, /c{at,ub,ount}s.txt matches /cats.txt, /cubs.txt, and /counts.txt, but not /cabs.txt.
A glob pattern that matches every item is /**/*. A glob pattern that matches every item/layout with the extension md is /**/*.md.
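The wildcard rules above can be tried out with Ruby's standard `File.fnmatch`. This is a minimal sketch, not a description of Nanoc's internals: the flag combination `File::FNM_PATHNAME | File::FNM_EXTGLOB` is an assumption chosen to reproduce the behaviour described here, and `glob_match?` is a hypothetical helper name.

```ruby
# Hypothetical helper: true if the identifier string matches the glob.
# Assumption: FNM_PATHNAME keeps * and ? from crossing a slash, and
# FNM_EXTGLOB enables {a,b} alternation. Whether Nanoc uses exactly
# these flags is not stated in this document.
def glob_match?(pattern, identifier)
  File.fnmatch(pattern, identifier, File::FNM_PATHNAME | File::FNM_EXTGLOB)
end

# * stays within one directory level.
glob_match?('/projects/*.md', '/projects/nanoc.md')            # => true
glob_match?('/projects/*.md', '/projects/cri.adoc')            # => false
glob_match?('/projects/*.md', '/projects/nanoc/about.md')      # => false

# **/ spans zero or more directory levels.
glob_match?('/projects/**/*.md', '/projects/nanoc.md')         # => true
glob_match?('/projects/**/*.md', '/projects/nanoc/history.md') # => true

# ? matches exactly one character; [kt] matches one character of the set.
glob_match?('/people/?im.md', '/people/kim.md')                # => true
glob_match?('/people/[kt]im.md', '/people/jim.md')             # => false

# {at,ub,ount} matches any one of the listed strings.
glob_match?('/c{at,ub,ount}s.txt', '/counts.txt')              # => true
glob_match?('/c{at,ub,ount}s.txt', '/cabs.txt')                # => false
```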
Regular expression patterns
You can use a regular expression to select items and layouts.
For matching identifiers, the %r{…} syntax is (arguably) nicer than the /…/ syntax. The latter is not a good fit for identifiers (or filenames), because all slashes need to be escaped. The \A and \z anchors are also useful to make sure the entire identifier is matched.
An example of a regular expression pattern is %r{\A/projects/(cri|nanoc)\.md\z}, which matches both /projects/nanoc.md and /projects/cri.md.
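The effect of the two delimiter styles and of the anchors can be checked with plain Ruby regular expressions applied to identifier strings. The method names below, and the identifiers /projects/cri.md.bak and /archive/projects/cri.md.bak, are made up for illustration.

```ruby
# The pattern from the text, written with %r{…}: no slash needs escaping.
def anchored
  %r{\A/projects/(cri|nanoc)\.md\z}
end

# The same expression with /…/ delimiters: every slash must be escaped.
def escaped
  /\A\/projects\/(cri|nanoc)\.md\z/
end

# The same expression without \A and \z: it can match in the middle
# of a longer identifier.
def unanchored
  %r{/projects/(cri|nanoc)\.md}
end

anchored.match?('/projects/nanoc.md')             # => true
escaped.match?('/projects/cri.md')                # => true
anchored.match?('/projects/cri.md.bak')           # => false
unanchored.match?('/archive/projects/cri.md.bak') # => true
```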
Legacy patterns
Legacy patterns are strings that contain wildcard characters. The wildcard characters behave differently than the glob wildcard characters.
To enable legacy patterns, set string_pattern_type to "legacy" in the configuration. For example:
string_pattern_type: "legacy"
For legacy patterns, Nanoc supports the following wildcards:
*
- Matches zero or more characters, including a slash. For example, /projects/*/ matches /projects/nanoc/ and /projects/nanoc/about/, but not /projects/.
+
- Matches one or more characters, including a slash. For example, /projects/+ matches /projects/nanoc/ and /projects/nanoc/about/, but not /projects/.
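One way to picture the legacy wildcards is as a translation to an anchored regular expression, with * becoming .* and + becoming .+. The sketch below is an illustration under that assumption, not Nanoc's own implementation, and `legacy_pattern_to_regexp` is a hypothetical helper name.

```ruby
# Hypothetical helper, not Nanoc's implementation: translate a legacy
# pattern into an anchored Regexp.
def legacy_pattern_to_regexp(pattern)
  # Split on the wildcards while keeping them as separate parts.
  body = pattern.split(/([*+])/).map do |part|
    case part
    when '*' then '.*' # zero or more characters, slashes included
    when '+' then '.+' # one or more characters, slashes included
    else Regexp.escape(part) # everything else matches literally
    end
  end.join
  Regexp.new("\\A#{body}\\z")
end

star = legacy_pattern_to_regexp('/projects/*/')
star.match?('/projects/nanoc/')       # => true
star.match?('/projects/nanoc/about/') # => true
star.match?('/projects/')             # => false

plus = legacy_pattern_to_regexp('/projects/+')
plus.match?('/projects/nanoc/')       # => true
plus.match?('/projects/')             # => false
```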