Formatting
vrugtehagel committed Jul 15, 2024
1 parent d874928 commit 5200d27
Showing 3 changed files with 56 additions and 36 deletions.
30 changes: 24 additions & 6 deletions README.md
@@ -72,17 +72,35 @@ the `.addPlugin()` call, as an object. It may have the following keys:

## About the codebase

In theory, asset hashing is relatively simple. A hash is constructed from each file's contents, and references to said files are modified to get a query parameter. This causes the browser to use cached versions of files when they didn't change, and to download the new version when the hash changed.

While this concept sounds simple, it can get a little complex. For one, we can't just first hash all the files and then add the query parameters, since adding the query parameters to a file changes the hash. For example, if file A references file B, and file B changes, then the naive method would not cause a change in file A even though it needs to be re-requested simply because it has a new reference to B.

In other words, to properly do asset hashing, we need to build a dependency tree of sorts, and hash leaves until nothing is left. Unfortunately, there's another issue: circular dependencies. If A depends on B and vice versa, then we can't add the correct hash parameters because B's hash is included in A and vice versa, meaning each hash is dependent on the other. To circumvent this issue, we hash all files within circular dependencies once, replace the hashes inside them, and then hash them again, replacing the hashes one last time. This ensures that if one file in the loop changes, all of them get a new hash; and if none of them changes, all the hashes remain the same.
In theory, asset hashing is relatively simple. A hash is constructed from each
file's contents, and references to said files are modified to get a query
parameter. This causes the browser to use cached versions of files when they
didn't change, and to download the new version when the hash changed.
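
As a rough illustration of the idea in this paragraph, the sketch below derives a hash from a file's contents and appends it to a reference as a query parameter. The names `toyHash` and `addVersionParam` are hypothetical, and the FNV-1a hash is only a dependency-free stand-in for a real digest (the plugin's default is SHA-256). Appending with `&` when a query string already exists is an assumption, not something this diff shows.

```typescript
// Toy stand-in for a real digest: 32-bit FNV-1a, returned as 8 hex characters.
// Hypothetical helper, used only to keep the example dependency-free.
function toyHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Append `param=hash` to a reference, keeping any existing query string.
// Hypothetical helper; the `&` branch is an assumption for illustration.
function addVersionParam(
  reference: string,
  hash: string,
  param: string = "v",
): string {
  const separator = reference.indexOf("?") === -1 ? "?" : "&";
  return `${reference}${separator}${param}=${hash}`;
}

// The URL only changes when the contents (and therefore the hash) change.
const stylesheetContents = "body { color: black; }";
const versionedUrl = addVersionParam("/style.css", toyHash(stylesheetContents));
console.log(versionedUrl);
```

Because the parameter is derived from the contents, an unchanged file keeps its URL and stays cached, while any edit produces a new URL.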

While this concept sounds simple, it can get a little complex. For one, we can't
just first hash all the files and then add the query parameters, since adding
the query parameters to a file changes the hash. For example, if file A
references file B, and file B changes, then the naive method would not cause a
change in file A even though it needs to be re-requested simply because it has a
new reference to B.

In other words, to properly do asset hashing, we need to build a dependency tree
of sorts, and hash leaves until nothing is left. Unfortunately, there's another
issue: circular dependencies. If A depends on B and vice versa, then we can't
add the correct hash parameters because B's hash is included in A and vice
versa, meaning each hash is dependent on the other. To circumvent this issue, we
hash all files within circular dependencies once, replace the hashes inside
them, and then hash them again, replacing the hashes one last time. This ensures
that if one file in the loop changes, all of them get a new hash; and if none of
them changes, all the hashes remain the same.
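
The two-pass approach for circular dependencies can be sketched on in-memory files as follows. This is a simplified model under stated assumptions, not the plugin's implementation: `toyHash`, `stamp`, and `hashCycle` are hypothetical names, the hash is a toy FNV-1a stand-in, and references are matched by plain substring search rather than by the path regex used in `src/asset-hash.ts`.

```typescript
// Toy stand-in for a real digest: 32-bit FNV-1a as 8 hex characters.
function toyHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Rewrite every occurrence of a known file name as `name?v=hash`.
// Assumes file names never appear inside each other or inside a hash.
function stamp(content: string, hashes: Map<string, string>): string {
  let result = content;
  hashes.forEach((hash, name) => {
    result = result.split(name).join(`${name}?v=${hash}`);
  });
  return result;
}

// Compute the final hashes for a set of mutually dependent files.
function hashCycle(files: Map<string, string>): Map<string, string> {
  // Pass 1: hash the untouched contents (the "naive" hashes).
  const naive = new Map<string, string>();
  files.forEach((content, name) => naive.set(name, toyHash(content)));

  // Pass 2: hash each file as it looks with the naive hashes inserted,
  // so every hash now depends on every file it references in the loop.
  const final = new Map<string, string>();
  files.forEach((content, name) => {
    final.set(name, toyHash(stamp(content, naive)));
  });
  return final;
}

const cycle = new Map<string, string>([
  ["a.js", 'import "./b.js";'],
  ["b.js", 'import "./a.js";'],
]);
const finalHashes = hashCycle(cycle);
// The files written to disk would then be `stamp(content, finalHashes)`.
cycle.forEach((content, name) => console.log(name, stamp(content, finalHashes)));
```

Since the second pass hashes content that already embeds the other file's naive hash, editing `a.js` alters the input of both second-pass hashes, while an unchanged loop reproduces the same hashes on every run.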

So, in broad steps, here's what we do:

1. Index all files that need to be processed.
2. Identify the referenced assets within those files, marking their positions.
3. If files exist that only reference assets that are not also indexed files, add the hashes to these files, and remove them from the index. Repeat this until no more such files exist.
3. If files exist that only reference assets that are not also indexed files,
add the hashes to these files, and remove them from the index. Repeat this
until no more such files exist.
4. Hash all the remaining files as-is.
5. Replace the references to the assets/files with their hash.
6. Hash all the remaining files once more.
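
Step 3 of the list above, the repeated removal of "leaf" files, can be modelled on a plain dependency map. The sketch below is an illustration only: `peelLeaves` is a hypothetical name, it works on path names rather than on real files, and it processes leaves in discrete rounds, whereas the actual code mutates its `referenceMap` concurrently.

```typescript
// `index` maps each indexed file to the paths it references.
// A file is a "leaf" when none of its references is still in the index.
function peelLeaves(
  index: Map<string, string[]>,
): { rounds: string[][]; remaining: string[] } {
  const pending = new Map(index);
  const rounds: string[][] = [];
  while (true) {
    const leaves = Array.from(pending)
      .filter(([, refs]) => !refs.some((ref) => pending.has(ref)))
      .map(([path]) => path);
    if (leaves.length === 0) break;
    // In the real plugin this is where hashes would be written into the leaf.
    for (const leaf of leaves) pending.delete(leaf);
    rounds.push(leaves);
  }
  // Whatever is left only references other pending files: the cycles.
  return { rounds, remaining: Array.from(pending.keys()) };
}

const graph = new Map<string, string[]>([
  ["index.html", ["main.js", "style.css"]],
  ["main.js", ["util.js"]],
  ["util.js", []],
  ["style.css", ["logo.png"]], // logo.png is an asset, not an indexed file
  ["a.js", ["b.js"]],
  ["b.js", ["a.js"]],
]);
const peeled = peelLeaves(graph);
console.log(peeled.rounds);    // [["util.js", "style.css"], ["main.js"], ["index.html"]]
console.log(peeled.remaining); // ["a.js", "b.js"]
```

The files left in `remaining` are exactly the ones that steps 4 to 6 handle with the two-pass hashing.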
60 changes: 31 additions & 29 deletions src/asset-hash.ts
@@ -19,20 +19,22 @@ export async function assetHash(
algorithm = "SHA-256",
maxLength = Infinity,
param = "v",
computeChecksum: computer = async (content: ArrayBuffer): Promise<string> => {
computeChecksum: computer = async (
content: ArrayBuffer,
): Promise<string> => {
const buffer = await crypto.subtle.digest(algorithm, content);
const uint8Array = new Uint8Array(buffer);
return btoa(String.fromCharCode(...uint8Array));
}
},
} = options;

/** Create a normalized `computeChecksum` that incorporates maxLength */
const hasMaxLength = Number.isFinite(maxLength) && maxLength > 0;
const computeChecksum = async (content: ArrayBuffer): Promise<string> => {
const hash = await computer(content)
if(!hasMaxLength) return hash;
return hash.slice(0, maxLength)
}
const hash = await computer(content);
if (!hasMaxLength) return hash;
return hash.slice(0, maxLength);
};

/** This is going to help resolving asset paths that we find */
const resolver = new PathResolver({
@@ -47,7 +49,7 @@ export async function assetHash(
const hashCache = new Map<string, Promise<string | null>>();
async function hashFile(path: string): Promise<string | null> {
const cached = hashCache.get(path);
if(cached) return await cached;
if (cached) return await cached;
const asyncResult = forceHashFile(path);
hashCache.set(path, asyncResult);
return await asyncResult;
@@ -57,7 +59,7 @@ export async function assetHash(
}
async function forceHashFile(path: string): Promise<string | null> {
const content = await fs.readFile(path).catch(() => null);
if(!content) return null;
if (!content) return null;
return await computeChecksum(content);
}

@@ -80,7 +82,7 @@ export async function assetHash(
hash?: string | null;
naiveHash?: string | null;
inserted?: number;
}
};
const referenceMap = new Map<string, Reference[]>();

/** Valid URL path characters are [!$%(-;@-[\]_a-z~].
@@ -95,14 +97,14 @@ export async function assetHash(
"g", // This is not part of the regex, it's just a flag
);

for(const filePath of filePaths){
for (const filePath of filePaths) {
const content = await fs.readFile(filePath, { encoding: "utf8" });
const matches = [...content.matchAll(pathRegex)];
const references = []
for(const match of matches){
const references = [];
for (const match of matches) {
const text = match[0];
const path = resolver.resolve(text, filePath);
if(path == null) continue;
if (path == null) continue;
const endIndex = match.index + text.length;
const hasParams = content[endIndex] == "?";
references.push({ text, path, endIndex, hasParams });
@@ -118,28 +120,28 @@ export async function assetHash(
previousSize = referenceMap.size;
await Promise.all([...referenceMap].map(async ([path, references]) => {
const hasDependencies = references
.some((reference) => referenceMap.has(reference.path))
.some((reference) => referenceMap.has(reference.path));
if (hasDependencies) return;
const allHashing = []
for(const reference of references){
const allHashing = [];
for (const reference of references) {
const promise = hashFile(reference.path);
promise.then(hash => reference.hash = hash);
promise.then((hash) => reference.hash = hash);
allHashing.push(promise);
}
await Promise.all(allHashing);
for(let index = references.length - 1; index >= 0; index--){
for (let index = references.length - 1; index >= 0; index--) {
const reference = references[index];
if(reference.hash != null) continue;
if (reference.hash != null) continue;
references.splice(index, 1);
}
if(references.length == 0){
if (references.length == 0) {
referenceMap.delete(path);
return;
}
const content = await fs.readFile(path, { encoding: "utf8" });
let transformed = content;
let offset = 0;
for(const reference of references){
for (const reference of references) {
reference.endIndex += offset;
const { endIndex, hasParams } = reference;
const hash = reference.hash as string;
@@ -151,7 +153,7 @@ export async function assetHash(
await fs.writeFile(path, transformed);
referenceMap.delete(path);
}));
} while(referenceMap.size < previousSize);
} while (referenceMap.size < previousSize);

/** Now that we've got all the "leaves" out of the way, the `referenceMap`
* now only contains paths with dependencies that are also in said
@@ -160,9 +162,9 @@ export async function assetHash(
* 4. Hash all the remaining files as-is. */
await Promise.all([...referenceMap].map(async ([path, references]) => {
await Promise.all(references.map(async (reference) => {
if(reference.hash) return;
if (reference.hash) return;
const hash = await hashFile(reference.path);
if(referenceMap.has(reference.path)){
if (referenceMap.has(reference.path)) {
reference.naiveHash = hash;
} else {
reference.hash = hash;
@@ -175,7 +177,7 @@ export async function assetHash(
let offset = 0;
const content = await fs.readFile(path, { encoding: "utf8" });
let transformed = content;
for(const reference of references){
for (const reference of references) {
reference.endIndex += offset;
const { hasParams, endIndex } = reference;
const hash = reference.hash ?? reference.naiveHash as string;
@@ -192,7 +194,7 @@ export async function assetHash(
/** 6. Hash all the remaining files once more. */
await Promise.all([...referenceMap].map(async ([path, references]) => {
await Promise.all(references.map(async (reference) => {
if(reference.hash) return;
if (reference.hash) return;
forgetHash(reference.path);
const hash = await hashFile(reference.path);
// console.log(reference.path, hash, reference.naiveHash, reference);
@@ -205,17 +207,17 @@ export async function assetHash(
let offset = 0;
const content = await fs.readFile(path, { encoding: "utf8" });
let transformed = content;
for(const reference of references){
for (const reference of references) {
reference.endIndex += offset;
if(reference.hash != null) continue;
if (reference.hash != null) continue;
const { hasParams, endIndex } = reference;
const inserted = reference.inserted as number;
const hash = reference.naiveHash as string;
transformed = stringSplice(
transformed,
endIndex + 1,
inserted - 1,
`${param}=${hash}`
`${param}=${hash}`,
);
offset += param.length + hash.length + 2 - inserted;
}
2 changes: 1 addition & 1 deletion src/string-splice.ts
@@ -3,7 +3,7 @@ export function stringSplice(
target: string,
index: number,
deleteCount: number = 0,
insertion: string = '',
insertion: string = "",
): string {
return target.slice(0, index) + insertion + target.slice(index + deleteCount);
}
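
To show how `stringSplice` combines with the `offset` bookkeeping seen in `src/asset-hash.ts`, here is a small sketch. `stringSplice` is copied from the file above so the example is self-contained; `insertAtAll` is a hypothetical helper that covers only the plain insertion case and leaves out the `hasParams` handling.

```typescript
// Copied from src/string-splice.ts so the example runs on its own.
function stringSplice(
  target: string,
  index: number,
  deleteCount: number = 0,
  insertion: string = "",
): string {
  return target.slice(0, index) + insertion + target.slice(index + deleteCount);
}

// Hypothetical helper: insert `suffix` at each position in `endIndices`.
// The positions refer to the ORIGINAL string and must be sorted ascending;
// `offset` tracks how far earlier insertions have shifted later positions.
function insertAtAll(
  content: string,
  endIndices: number[],
  suffix: string,
): string {
  let transformed = content;
  let offset = 0;
  for (const endIndex of endIndices) {
    transformed = stringSplice(transformed, endIndex + offset, 0, suffix);
    offset += suffix.length;
  }
  return transformed;
}

// "a.js" ends at index 4 and "b.js" ends at index 9 in the original string.
console.log(insertAtAll("a.js b.js", [4, 9], "?v=X")); // "a.js?v=X b.js?v=X"
```

Without the running `offset`, the second insertion would land four characters too early, because the first insertion has already lengthened the string.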
