
Strings

A string is JavaScript's primitive for text — anything from a single character to a JSON blob with megabytes of data. Strings are immutable: methods that look like they modify a string always return a new one and leave the original alone. Internally JavaScript stores strings as sequences of 16-bit code units (UTF-16), which mostly doesn't matter — until it does, with emoji and other characters above U+FFFF.

Three ways to write a string literal

JS
const single = 'hello';
const double = "hello";
const back   = `hello`;

console.log(single === double && double === back); // true

Single and double quotes are interchangeable. Most teams pick one and stick to it. Single quotes are a common convention, but they force you to escape the apostrophe in contractions ('don\'t'), which double quotes avoid ("don't"). Backticks are template literals, which support interpolation and multi-line strings; see the next page for details.
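As a quick preview of what backticks add, the sketch below interpolates an expression and spans two lines. The variable names (`user`, `count`, `summary`, `twoLines`) are made up for illustration.

```javascript
// Illustrative values; any expression can go inside ${...}.
const user = "Ada";
const count = 3;

// Interpolation: the result of each ${...} expression is converted to a string.
const summary = `${user} has ${count + 1} items`;
console.log(summary); // "Ada has 4 items"

// A template literal may span lines; the line break becomes part of the string.
const twoLines = `first
second`;
console.log(twoLines === "first\nsecond"); // true
```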

Escapes

When a character is hard to type, hard to read, or has a special meaning, escape it with a backslash:

JS
"line one\nline two"    // newline
"tab\there"             // tab
"quote: \"hi\""        // literal double quote
'O\'Reilly'             // literal single quote
"\\backslash"          // literal backslash
"\u00e9"                // é   — 4-digit unicode escape
"\u{1F600}"             // 😀  — code-point escape (any length)

Length, indexing and immutability

JS
const s = "hello";

s.length;      // 5
s[0];          // "h"
s[4];          // "o"
s[10];         // undefined

s.at(-1);      // "o"   — like Python, negative indexes work
s.charAt(2);   // "l"
s.charCodeAt(0); // 104  — UTF-16 code unit
s.codePointAt(0); // 104 — handles surrogate pairs properly

s[0] = "H";    // silently ignored in sloppy mode; throws a TypeError in strict mode
console.log(s); // "hello" — strings are immutable

Strings are immutable
Assigning to `s[0]` does **not** change the string. To "edit" a string you build a new one — slice, concatenate, or use a method that returns a new string.
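A minimal sketch of "editing" by building a new string. `replaceAt` is a hypothetical helper written for this example, not a built-in, and it works on UTF-16 code units, so it assumes the character being replaced fits in one code unit.

```javascript
// Hypothetical helper: returns a copy of `str` with the code unit at `index`
// replaced by `replacement`. The input string is never modified.
function replaceAt(str, index, replacement) {
  return str.slice(0, index) + replacement + str.slice(index + 1);
}

const original = "hello";
const edited = replaceAt(original, 0, "H");

console.log(edited);   // "Hello"
console.log(original); // "hello" (unchanged)

// Built-in methods follow the same rule: they return a new string.
const shouted = original.toUpperCase();
console.log(shouted);  // "HELLO"
console.log(original); // "hello"
```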

Concatenation

JS
const greeting = "Hello, " + "world!";   // "+" is the classic operator
const name = "Ada";
const out = "Hi " + name + "!";          // works but gets verbose

const better = `Hi ${name}!`;             // template literal — preferred

// String.prototype.concat exists but is almost never used
"abc".concat("def", "ghi");              // "abcdefghi"

Comparison and ordering

Strings compare with <, >, <= and >= based on UTF-16 code-unit order — which is not the same as alphabetical order for many languages.

JS
"apple" < "banana";   // true
"Z" < "a";            // true  — uppercase letters come before lowercase in ASCII
"é" < "f";            // false: "é" is U+00E9, which sorts after every ASCII letter
"10" < "9";           // true  — string comparison is lexicographic!

// For human-friendly sorting use Intl.Collator or localeCompare
"é".localeCompare("f", "fr");  // negative number: "é" sorts before "f" in French
["10", "9", "11"].sort((a, b) => a.localeCompare(b, undefined, { numeric: true }));
// ["9", "10", "11"]

Sorting user-visible text
If you sort strings that real people will read, pass them through `Intl.Collator` or `localeCompare` with the user's locale. The default sort puts `"Z"` before `"a"` and `"é"` after `"z"` — usually not what you want.
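The difference is easy to see side by side. This is a sketch with made-up sample data; `"fr"` stands in for whatever locale the user actually has.

```javascript
// Sample data chosen for illustration.
const words = ["Zoe", "apple", "éclair", "banana"];

// Default sort: UTF-16 code-unit order, so "Z" comes first and "é" comes last.
const byCodeUnit = [...words].sort();
console.log(byCodeUnit); // ["Zoe", "apple", "banana", "éclair"]

// Locale-aware sort: a collator's compare function can be passed straight to sort().
const collator = new Intl.Collator("fr");
const byLocale = [...words].sort(collator.compare);
console.log(byLocale); // ["apple", "banana", "éclair", "Zoe"]
```

Creating one `Intl.Collator` and reusing its `compare` function is usually cheaper than calling `localeCompare` with a locale on every comparison of a large array.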

Equality with == vs ===

JS
"5" == 5;            // true  — == coerces, "5" becomes 5
"5" === 5;           // false — different types
"abc" === "abc";     // true  — same characters, same type
new String("a") === "a";   // false — primitive vs object wrapper

new String("a") == "a";    // true  — == coerces here too

Avoid `new String(...)`
The `new String`, `new Number`, `new Boolean` constructors create *wrapper objects* that behave subtly differently from their primitives. You essentially never need them — string literals and the `String()` function (without `new`) cover every real-world case.
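A short sketch of where wrappers and primitives diverge. The `new String` calls appear only to demonstrate the pitfall.

```javascript
const primitive = "a";
const wrapped = new String("a"); // wrapper object, shown only to illustrate the pitfall
const converted = String(42);    // String() without `new` returns a primitive

console.log(typeof primitive); // "string"
console.log(typeof wrapped);   // "object"
console.log(typeof converted); // "string"

// Wrapper objects are always truthy, even when they wrap an empty string.
console.log(Boolean(""));             // false
console.log(Boolean(new String(""))); // true

// If you are ever handed a wrapper, valueOf() recovers the primitive.
console.log(wrapped.valueOf() === primitive); // true
```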

Strings and Unicode (the short version)

JavaScript strings are sequences of UTF-16 code units. Every character in the Basic Multilingual Plane (U+0000 to U+FFFF), which covers ASCII, the European alphabets and the most common CJK characters, fits in one code unit, so length and indexing "just work". Characters above U+FFFF (most emoji, rarer CJK ideographs, the mathematical alphanumeric symbols) are encoded as surrogate pairs: two code units each.
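The pairing arithmetic is short enough to write out. The sketch below follows the standard UTF-16 encoding rule; `toSurrogatePair` is a hypothetical helper for illustration, and it assumes its argument is a code point above U+FFFF.

```javascript
// Hypothetical helper: splits a code point above U+FFFF into its two UTF-16 code units.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;     // a 20-bit value
  const high = 0xd800 + (offset >> 10);   // top 10 bits go in the high surrogate
  const low = 0xdc00 + (offset & 0x3ff);  // bottom 10 bits go in the low surrogate
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1f600); // U+1F600 is 😀

console.log(high.toString(16)); // "d83d"
console.log(low.toString(16));  // "de00"

// Writing the two code units side by side produces the same string as the code point.
console.log(String.fromCharCode(high, low) === String.fromCodePoint(0x1f600)); // true
```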

JS
const cafe = "café";
cafe.length;            // 4 — one code unit per char

const smile = "😀";
smile.length;           // 2 — surrogate pair!
smile[0];               // "\uD83D"  — half of a character
[...smile].length;      // 1 — spreading into an array uses code-point iteration

// String.prototype.codePointAt handles surrogate pairs
smile.codePointAt(0);   // 128512
String.fromCodePoint(128512); // "😀"

Three takeaways:

  • length and s[i] count code units, not visible characters.

  • To iterate over code points rather than code units, use [...s] or for (const ch of s). Both are code-point aware.

  • For grapheme clusters (flags, family emoji with skin-tone modifiers, accented letters built from combining marks), use Intl.Segmenter — even code points are not enough.
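The first two points can be compared on one string. This is a small sketch; the sample text and the variable names are arbitrary.

```javascript
const text = "a😀b";

// Index-based loop: walks UTF-16 code units, so the emoji is split into two halves.
const units = [];
for (let i = 0; i < text.length; i++) {
  units.push(text[i]);
}
console.log(units.length); // 4

// for...of: walks code points, so the emoji stays whole.
const points = [];
for (const ch of text) {
  points.push(ch);
}
console.log(points.length); // 3
console.log(points[1] === "😀"); // true
```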

Counting visible characters correctly

JS
const flag = "🇯🇵"; // one flag, but...

flag.length;             // 4 — two regional-indicator pairs
[...flag].length;        // 2 — two code points
new Intl.Segmenter("en", { granularity: "grapheme" })
  .segment(flag);        // iterating gives 1 segment
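To turn that into a number, count the segments. `countGraphemes` below is a hypothetical helper, not a built-in, and it assumes a runtime that ships `Intl.Segmenter`.

```javascript
// Hypothetical helper: counts grapheme clusters, roughly "characters as a reader sees them".
function countGraphemes(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  let count = 0;
  for (const _segment of segmenter.segment(text)) {
    count++;
  }
  return count;
}

console.log(countGraphemes("🇯🇵"));      // 1
console.log(countGraphemes("hello"));   // 5

// "e" followed by a combining acute accent: two code units, one visible character.
const accented = "e\u0301";
console.log(accented.length);           // 2
console.log(countGraphemes(accented));  // 1
```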

Strings as iterables

Strings are iterable, which means many array-like patterns just work:

JS
for (const ch of "Ada") console.log(ch);  // "A", "d", "a"
const chars = [..."Ada"];                  // ["A", "d", "a"]
Array.from("Ada");                         // ["A", "d", "a"]

// String -> array of code points -> back to string
[..."héllo"].reverse().join("");           // "olléh"

That covers the shape of strings. The next page is a tour of the methods you'll reach for every day — slice, split, includes, replace and friends.