-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: allow digits in the local part of prefixed names #470
Conversation
WalkthroughThe changes in this pull request involve modifications to the Changes
Possibly related issues
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (4)
test/N3StreamWriter-test.js (1)
42-46
: Consider expanding test cases for comprehensive digit support.While the current test case covers basic digit support (
3a
,a3
), consider adding test cases for:
- Multiple digits (e.g.,
a:123
)- Only digits (e.g.,
a:456
)- Digits in different positions (e.g.,
a:1a2b3
)src/N3Writer.js (1)
300-300
: Consider adding explicit test cases for edge cases.While the change is correct, consider adding test cases for various edge cases:
- Local names starting with different digits (0-9)
- Mixed digit-letter combinations
- Multiple prefixed names with digits in a single document
Would you like me to help generate comprehensive test cases for these scenarios?
test/N3Writer-test.js (2)
274-274
: Ensure consistency in prefix usage in test casesIn the test case starting at line 274, the prefixes used in the serialized output include
a:3a
,b:3a
, andb:a3
. While digits are now allowed in the local part of prefixed names, it's important to ensure that the test cases cover various scenarios, including prefixes that start with digits and those that contain digits in different positions.Consider adding additional test cases where the local name starts with a digit, ends with a digit, and contains digits in between letters to fully validate the new functionality.
905-905
: Validate serialization of prefixed names with digits in various positionsTo thoroughly test the new regex pattern in
_prefixRegex
, it's important to include test cases that cover prefixed names with digits in different positions within the local part. This includes local names that:
- Start with a digit
- End with a digit
- Contain digits in the middle
Adding these test cases will ensure that the writer correctly serializes all valid prefixed names according to the updated specification.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
src/N3Writer.js
(1 hunks)test/N3StreamWriter-test.js
(1 hunks)test/N3Writer-test.js
(2 hunks)
🔇 Additional comments (2)
test/N3StreamWriter-test.js (1)
44-44
: Test expectation correctly updated to validate digit support in prefixed names.
The change from '<http://a.org/3a> <http://a.org/b#3a> b:a3.\n'
to 'a:3a b:3a b:a3.\n'
properly validates that the StreamWriter now supports digits in prefixed names, which is the main objective of this PR.
Let's verify the test coverage for various digit positions in prefixed names:
✅ Verification successful
Let me analyze the test coverage for digit support in prefixed names based on the search results.
The search results show several test cases that cover digits in prefixed names:
- In N3StreamWriter-test.js and N3Writer-test.js:
['http://a.org/3a', 'http://a.org/b#3a', 'http://a.org/b#a3']
'a:3a b:3a b:a3.\n'
- In N3Lexer-test.js:
'leg:3032571 isbn13:9780136019701'
The test coverage appears comprehensive as it includes:
- Digits at start of local name (3a)
- Digits at end of local name (a3)
- Multiple digits in local name (3032571, 9780136019701)
- Different prefixes with digits (leg:, isbn13:)
Test coverage for digits in prefixed names is sufficient and validates the change.
The test cases properly verify that the N3 writer can handle digits in various positions within prefixed names, which aligns with the PR's objective of allowing digits in the local part of prefixed names.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if we have sufficient test coverage for different digit positions in prefixed names
# Expected: Tests should cover digits at start, middle, and end of local names
# Test: Search for test cases with digits in prefixed names
rg -A 5 "['\"].*:[0-9]|[0-9].*:.*[\n'\"]" test/
Length of output: 82391
src/N3Writer.js (1)
300-300
: LGTM! Regex update allows digits in local names.
The modification to include digits (0-9
) in the first character class of the local name pattern correctly implements support for prefixed names starting with digits (e.g., sdo:3DModel
). This change aligns with the RDF 1.1 Turtle specification.
Let's verify the test coverage for this change:
✅ Verification successful
Regex change is well-tested and compliant
The codebase already has comprehensive test coverage for prefixed names with leading digits:
- N3Lexer explicitly tests cases like
leg:3032571
andisbn13:9780136019701
- Both N3Writer and N3StreamWriter verify serialization of digit-prefixed names with
a:3a
,b:3a
- Tests cover both the lexing and writing aspects of the functionality
The regex modification aligns with existing test cases, confirming it's a valid enhancement rather than a breaking change.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if there are test cases covering the new digit-prefixed functionality
# Expected: Find test cases that verify the serialization of prefixed names starting with digits
# Search for relevant test cases in test files
rg -l "test|describe|it.*['\"].*\d+.*['\"]" "test/"
# Search for actual usage of digit-prefixed names in tests
rg ":\d+" "test/"
Length of output: 682
Script:
#!/bin/bash
# Let's examine the specific test cases in N3Writer-test.js and N3StreamWriter-test.js
# that contain digit-prefixed names to ensure they adequately test the new functionality
# Check N3Writer-test.js content around the digit-prefixed test
rg -B 5 -A 5 "a:3a" test/N3Writer-test.js
# Check N3StreamWriter-test.js content around the digit-prefixed test
rg -B 5 -A 5 "a:3a" test/N3StreamWriter-test.js
# Also check N3Lexer-test.js for the ISBN test case context
rg -B 5 -A 5 "leg:3032571" test/N3Lexer-test.js
Length of output: 1607
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW It's not a bug but a performance consideration. The N3Writer does not aim to exhaustively abbreviate; many other cases aren't implemented either. Happy to have this accepted, but only if performance is not impacted.
It is indeed not a bug (I never implied this or meant to imply it), I merely used Performance impact should be minimal. I created a small test of 8M matches (Mac Silicon M3):
As you can see the impact is minimal: 7ms on 8M runs. Test performed: const assert = require('assert');
function escapeRegex(regex) {
return regex.replace(/[\]\/\(\)\*\+\?\.\\\$]/g, '\\$&');
}
const prefixIRIs = {
rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
rdfs: 'http://www.w3.org/2000/01/rdf-schema#',
xsd: 'http://www.w3.org/2001/XMLSchema#',
owl: 'http://www.w3.org/2002/07/owl#',
dc: 'http://purl.org/dc/elements/1.1/',
dcTerms: 'http://purl.org/dc/terms/',
foaf: 'http://xmlns.com/foaf/0.1/',
schema: 'http://schema.org/',
}
let IRIlist = '', prefixList = '';
for (const prefixIRI in prefixIRIs) {
IRIlist += IRIlist ? `|${prefixIRI}` : prefixIRI;
prefixList += (prefixList ? '|' : '') + prefixIRIs[prefixIRI];
}
const terms = Object.keys(prefixIRIs).map(prefix => prefixIRIs[prefix] + '3DModel');
IRIlist = escapeRegex(IRIlist, /[\]\/\(\)\*\+\?\.\\\$]/g, '\\$&');
let prefixRegex = new RegExp(`^(?:${prefixList})[^\/]*$|` +
`^(${IRIlist})([_a-zA-Z][\\-_a-zA-Z0-9]*)$`);
runPerfTest(prefixRegex, 'v1');
prefixRegex = new RegExp(`^(?:${prefixList})[^\/]*$|` +
`^(${IRIlist})([_a-zA-Z0-9][\\-_a-zA-Z0-9]*)$`);
runPerfTest(prefixRegex, 'v2');
function runPerfTest(prefixRegex, version) {
const TEST = `- Replacing prefixes with RegExp (${version})`;
console.time(TEST);
let count = 0;
const maxRuns = 1000000;
const expectMatch = terms.length * maxRuns
let actualMatch = 0;
while (count < maxRuns) {
// Replace prefixes with RegExp
for (const term of terms) {
actualMatch += prefixRegex.exec(term) ? 1 : 0;
}
count++
}
assert.equal(actualMatch, expectMatch);
console.timeEnd(TEST);
console.log(`* Number of runs: ${count.toLocaleString()}
* Terms matches: ${actualMatch.toLocaleString()}
* Memory usage: ${Math.round(process.memoryUsage().rss / 1024 / 1024)}MB
`);
} |
Thanks @mightymax |
🎉 This PR is included in version 1.23.0 🎉 The release is available on: Your semantic-release bot 📦🚀 |
FYI @mightymax - repos like this use semantic release so it helps us version if you use semantic commit conventions in commits and PR titles. |
Fixes issue #469: digits are now correctly used as local names, e.g.:
Summary by CodeRabbit
New Features
Bug Fixes
Tests