
exp-algo: Louvain community detection (experimental) #67

Open
shahar-biron wants to merge 4 commits into main from exp-louvain

shahar-biron (Collaborator) commented Feb 9, 2026

Adds experimental Louvain community detection under flex.exp.louvain.

Highlights:

  • New UDF: flex.exp.louvain({nodes: collect(n), ...})
  • Weighted edges via weight relationship property (default 1)
  • Integration test demonstrating a realistic weighted community scenario
  • Documentation with an explicit EXPERIMENTAL warning

Also includes CI stability fixes for the existing integration test suite:

  • Grant the required `checks: write` and `pull-requests: write` permissions
  • Force jest-coverage-report-action to run Jest in-band to avoid flakiness

Co-Authored-By: Warp <agent@warp.dev>

Summary by CodeRabbit

  • New Features

    • Added an experimental Louvain community-detection algorithm for graph clustering.
  • Documentation

    • Added comprehensive docs, examples, and an "Experimental Graph Algorithms" section in the README.
  • Tests

    • Added integration tests validating Louvain on weighted graphs, including a scalable long-running scaffold.
  • Chores

    • Improved CI test reliability by adjusting workflow permissions and running Jest in single-threaded mode to reduce flaky parallel test failures.

shahar-biron and others added 3 commits February 9, 2026 12:47
Grant checks/pull-requests write permissions and force jest-coverage-report-action to run Jest in-band.

Co-Authored-By: Warp <agent@warp.dev>
Introduce flex.exp.louvain (experimental) and an integration test demonstrating community detection on a weighted graph.

Co-Authored-By: Warp <agent@warp.dev>
Add documentation for flex.exp.louvain and clearly mark exp-algo functions as experimental.

Co-Authored-By: Warp <agent@warp.dev>
coderabbitai bot commented Feb 9, 2026

📝 Walkthrough

Adds an experimental Louvain community-detection feature (implementation, docs, and integration tests) and updates CI to force single-threaded Jest runs and grant write permissions for checks and pull-requests.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Workflow Configuration (`.github/workflows/test.yml`) | Grant top-level write permissions for checks and pull-requests; run Jest with `npx jest --runInBand --forceExit` to avoid flaky parallel integration tests. |
| Documentation (`README.md`, `docs/README.md`, `docs/exp-algo/louvain.md`) | Add "Experimental" function category entries and a new "Experimental Graph Algorithms" section; introduce detailed docs for `exp.louvain` (syntax, params, returns, examples, notes, experimental warning). |
| Algorithm Implementation (`src/exp-algo/louvain.js`) | New full Louvain implementation: undirected adjacency building via `graph.traverse`, edge-weight handling, one-level local moves, graph coarsening, multi-level iteration, partition/community utilities, `falkor.register('exp.louvain', louvain)`, and `module.exports` for testability. |
| Integration Tests (`tests/exp-algo/louvain.test.js`) | Add integration tests that seed a weighted 3-team graph, assert three communities of size four, verify Node export, and include an optional large-scale long-running test gated by `FLEX_LONG_TESTS` / `FLEX_LOUVAIN_SCALE`. |

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client
participant FalkorUDF as FalkorDB UDF
participant Graph as GraphStore
participant Louvain as LouvainAlgo
Client->>FalkorUDF: call exp.louvain(nodes, params)
FalkorUDF->>Graph: graph.traverse(nodes, direction, maxEdgesPerNode)
Graph-->>FalkorUDF: edges & nodes stream
FalkorUDF->>Louvain: buildUndirectedAdjacency(edges)
Louvain-->>Louvain: oneLevel + inducedGraph + coarsen loop
Louvain-->>FalkorUDF: partition, communities, levels (optional debug)
FalkorUDF-->>Client: return result
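
The "coarsen" step in the diagram can be illustrated with a minimal sketch. This is not the code from `src/exp-algo/louvain.js`; the function name `coarsen`, the edge shape `{ source, target, weight }`, and the `communityOf` map are all assumptions made for illustration. It collapses each community into one super-node and sums the weights between communities, keeping intra-community weight as a self-loop.

```javascript
// Hypothetical helper (not the PR's implementation): collapse a weighted
// undirected graph so that every community becomes a single super-node.
// `edges` is an array of { source, target, weight };
// `communityOf` is a Map from node id to a numeric community id.
function coarsen(edges, communityOf) {
  const superEdges = new Map(); // "lo,hi" -> { source, target, weight }
  for (const { source, target, weight } of edges) {
    const a = communityOf.get(source);
    const b = communityOf.get(target);
    const lo = Math.min(a, b);
    const hi = Math.max(a, b);
    const key = `${lo},${hi}`;
    const entry = superEdges.get(key);
    if (entry) {
      entry.weight += weight; // several original edges map to one super-edge
    } else {
      // lo === hi means the edge is internal and becomes a self-loop
      superEdges.set(key, { source: lo, target: hi, weight });
    }
  }
  return Array.from(superEdges.values());
}

// Toy input: nodes 1,2 form community 0 and nodes 3,4 form community 1.
const toyEdges = [
  { source: 1, target: 2, weight: 10 },
  { source: 3, target: 4, weight: 10 },
  { source: 2, target: 3, weight: 1 },
  { source: 1, target: 4, weight: 1 },
];
const toyCommunities = new Map([[1, 0], [2, 0], [3, 1], [4, 1]]);
const coarseEdges = coarsen(toyEdges, toyCommunities);
```

In this sketch the total edge weight is the same before and after coarsening, which is the property a multi-level loop relies on when it repeats the local-move phase on the smaller graph.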

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 I hopped through nodes and followed weighty trails,

Louvain curled up where community prevails,
I nudged edges gently, coarsened every day,
Found three warm clusters where small rabbits play,
Experimental whiskers twitch — hooray! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.71% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding Louvain community detection as an experimental algorithm to the flex.exp namespace. |


No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
tests/exp-algo/louvain.test.js (2)

27-27: Consider adding an explicit timeout for this integration test.

This test performs multiple round-trips to a live database (cleanup, graph creation, two queries, and the Louvain UDF call). The default Jest timeout (5 seconds) may be too tight, especially in CI. The long-running test on Line 226 already sets a generous timeout — consider adding one here too (e.g., 30–60 seconds).


162-189: Consider batching node/edge creation for very large scales.

For the default scale=1000, the UNWIND on Line 163 produces 12,000 rows in a single transaction, and the chain query (Line 169) creates ~4,000 edges per team in one go. This works, but if anyone bumps the scale significantly higher, they may hit memory or transaction-size limits. Since this is an opt-in stress test, it's fine as-is — just worth a brief note in the comment block (Lines 138-145) about practical upper bounds.
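
One way to bound transaction size is to split the generated rows into fixed-size batches before sending them to the database. The sketch below is only an illustration: the `chunk` helper, the batch size of 5000, and the row shape are assumptions and do not come from the test file.

```javascript
// Hypothetical helper: split `rows` into consecutive batches of at most
// `size` elements so each batch can be sent as its own query parameter.
function chunk(rows, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// 12 * scale rows, matching the 12,000 rows mentioned for scale=1000.
const scale = 1000;
const rows = Array.from({ length: 12 * scale }, (_, i) => ({ id: i }));

// Assumed batch size; a real limit would depend on the deployment.
const batches = chunk(rows, 5000);
// Each element of `batches` would then be passed to one UNWIND query
// instead of sending all rows in a single transaction.
```

With these numbers the 12,000 rows end up in three batches, the last one smaller than the others.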




github-actions bot commented Feb 9, 2026

Coverage report

| St. | Category | Percentage | Covered / Total |
|---|---|---|---|
| 🟡 | Statements | 60.32% (-28.14% 🔻) | 415/688 |
| 🔴 | Branches | 43.99% (-28.94% 🔻) | 194/441 |
| 🟡 | Functions | 78.87% (-21.13% 🔻) | 56/71 |
| 🟡 | Lines | 62.63% (-30.36% 🔻) | 372/594 |

New covered files 🐣

| St. | File | Statements | Branches | Functions | Lines |
|---|---|---|---|---|---|
| 🔴 | exp-algo/louvain.js | 0.45% | 0% | 0% | 0.51% |

Test suite run success

79 tests passing in 27 suites.

Report generated by 🧪jest coverage report action from 549c075

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/exp-algo/louvain.js`:
- Around line 118-165: The code currently accumulates weights into
undirectedWeightByKey (using undirectedWeightByKey.set(key, (get||0) + w)),
causing duplicate counting when the same edge is seen from both traversal
directions; change this to deduplicate by using a first-write-wins (or Math.max)
strategy: when computing the canonical key via makeUndirectedEdgeKey and
updating undirectedWeightByKey, set the value only if the key is not already
present (or replace with Math.max(existing, w)); apply the same change to the
fallback edge-handling path that also writes to undirectedWeightByKey, and leave
selfLoopWeightById behavior unchanged for true self-loops.
🧹 Nitpick comments (4)
.github/workflows/test.yml (1)

33-54: Tests run twice with different parallelism settings — first run may still be flaky.

Line 36 runs npm test -- --coverage (likely with Jest's default parallel workers), while the coverage-report action at Line 49 re-runs tests with --runInBand. If the flakiness stems from parallel execution against a single FalkorDB instance, the first run (Line 36) is still susceptible. Consider adding --runInBand to the first run as well, or consolidating into a single test execution that both checks coverage thresholds and feeds the report action.

Proposed fix: make the first run consistent
       - name: Run Tests with Coverage
         # Because coverageThreshold is in jest.config.js, 
         # this step will fail if coverage is < 90%
-        run: npm test -- --coverage 
+        run: npm test -- --coverage --runInBand --forceExit
         env:
           DEBUG: "true"
src/exp-algo/louvain.js (1)

194-207: Edge-key split on '::' is fragile if node IDs ever contain that substring.

makeUndirectedEdgeKey joins with '::', and this code splits on '::' to recover endpoints. If a node ID is a string containing '::', the split will produce the wrong parts. Currently safe with integer FalkorDB IDs, but worth a defensive note or a separator that cannot appear in stringified IDs (e.g., a NUL character, or storing endpoints separately).
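
The failure mode and the "store endpoints separately" alternative can be sketched as follows. The helpers `canonicalPair` and `addEdge` are hypothetical and are not taken from the PR; the sketch also assumes that all IDs in one graph share a type (all numbers or all strings).

```javascript
// Fragile approach: join with '::' and split later to recover endpoints.
const fragileKey = ['x::y', 'z'].join('::');   // one ID itself contains '::'
const fragileParts = fragileKey.split('::');   // no longer two parts

// Hypothetical alternative: keep the endpoints in the map value, and build
// the key with JSON.stringify so it never has to be parsed back.
function canonicalPair(a, b) {
  // Order by string form so (a, b) and (b, a) produce the same pair.
  return String(a) <= String(b) ? [a, b] : [b, a];
}

function addEdge(store, a, b, weight) {
  const [lo, hi] = canonicalPair(a, b);
  const key = JSON.stringify([lo, hi]);
  if (!store.has(key)) {
    store.set(key, { source: lo, target: hi, weight });
  }
}

const store = new Map();
addEdge(store, 'z', 'x::y', 3);
addEdge(store, 'x::y', 'z', 3); // same undirected edge, reversed order
const [onlyEdge] = Array.from(store.values());
```

Because the endpoints are read from the stored value, no separator choice can corrupt them, and integer IDs keep working unchanged.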

docs/README.md (1)

108-116: Experimental section is well-documented with appropriate warnings.

One minor omission: the "Browse by category" section (around Line 122–128) doesn't include a link for Experimental Graph Algorithms, unlike every other category. Consider adding one for discoverability:

 - [Bitwise Functions](./docs/README.md#bitwise-functions-flexbitwise) - Low-level bit operations
+- [Experimental Graph Algorithms](./docs/README.md#experimental-graph-algorithms-flexexp) - Community detection and more

(in the root README.md)

tests/exp-algo/louvain.test.js (1)

27-136: Solid integration test with clear community structure.

The 3-team graph with strong intra-team (weight=10) and weak cross-team (weight=1) edges is a good realistic scenario. The validation logic correctly treats community IDs as opaque and checks membership by name.

One observation: this only exercises direction: 'incoming'. Consider adding a small test with the default direction (both incoming and outgoing) to catch regressions related to bidirectional traversal — especially given the weight-accumulation behavior flagged in the implementation review.

Comment on lines +118 to +165
// First collect undirected edges into a canonical key -> weight map.
// This avoids double-counting when the same connection is observed from
// multiple traversal directions / endpoints.
const undirectedWeightByKey = new Map();
const selfLoopWeightById = new Map();

const debugEdgeCounts = debug ? Object.create(null) : null;

const directions = Array.isArray(direction) ? direction : [direction];

for (const dir of directions) {
  for (let i = 0; i < nodes.length; i++) {
    const current = nodes[i];
    const currentId = getNodeId(current);
    if (typeof currentId === 'undefined') continue;

    const reachables =
      graph.traverse([current], {
        direction: dir,
        returnType: 'edges',
      }) || [];

    const edges = reachables[0] || [];
    if (debugEdgeCounts) {
      debugEdgeCounts[String(currentId)] = edges.length;
    }
    const edgeCount = Math.min(edges.length, maxEdgesPerNode);

    for (let e = 0; e < edgeCount; e++) {
      const edge = edges[e];
      const w = getWeight(edge);
      if (!(w > 0)) continue;

      const s = edge ? edge.source : undefined;
      const d = edge ? edge.destination : undefined;
      const sId = s != null ? getNodeId(s) : undefined;
      const dId = d != null ? getNodeId(d) : undefined;

      // Preferred path: use both edge endpoints directly.
      if (typeof sId !== 'undefined' && sId !== null && typeof dId !== 'undefined' && dId !== null) {
        if (sId === dId) {
          selfLoopWeightById.set(sId, (selfLoopWeightById.get(sId) || 0) + w);
          continue;
        }
        const key = makeUndirectedEdgeKey(sId, dId);
        undirectedWeightByKey.set(key, (undirectedWeightByKey.get(key) || 0) + w);
        continue;
      }

⚠️ Potential issue | 🟠 Major

Edge weights are accumulated (not deduplicated) across traversal directions — the comment is misleading.

When direction = ['incoming', 'outgoing'] (the default), the same physical edge A→B is discovered once from A's outgoing traversal and once from B's incoming traversal. Both produce the same canonical key, but since the weight is summed ((get || 0) + w), the effective weight doubles.

For a uniform graph this doesn't affect community assignments (the modularity gain formula scales linearly), but if some edges are only visible from one direction (e.g., an endpoint is outside the nodes list), scaling becomes non-uniform and can distort results.
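
The scaling argument can be checked on a toy graph. The sketch below uses the standard modularity definition for weighted undirected graphs, Q = Σ_c [ L_c / m − (d_c / 2m)² ], where L_c is the weight inside community c, d_c is its total weighted degree, and m is the total edge weight. The function name, the edge shape, and the toy data are assumptions for illustration, not code from the PR.

```javascript
// Hypothetical helper: modularity of a partition of a weighted undirected
// graph. `edges` is an array of { source, target, weight };
// `communityOf` is a plain object mapping node id to community id.
function modularity(edges, communityOf) {
  let m = 0;                   // total edge weight
  const internal = new Map();  // community -> weight of edges inside it
  const degree = new Map();    // community -> sum of weighted degrees
  for (const { source, target, weight } of edges) {
    m += weight;
    const a = communityOf[source];
    const b = communityOf[target];
    degree.set(a, (degree.get(a) || 0) + weight);
    degree.set(b, (degree.get(b) || 0) + weight);
    if (a === b) internal.set(a, (internal.get(a) || 0) + weight);
  }
  let q = 0;
  for (const [c, tot] of degree) {
    q += (internal.get(c) || 0) / m - (tot / (2 * m)) ** 2;
  }
  return q;
}

const communityOf = { A: 0, B: 0, C: 1, D: 1 };
const base = [
  { source: 'A', target: 'B', weight: 10 },
  { source: 'C', target: 'D', weight: 10 },
  { source: 'B', target: 'C', weight: 1 },
];
// Every edge doubled: what uniform double-counting would produce.
const doubled = base.map((e) => ({ ...e, weight: e.weight * 2 }));
// Only the intra-team edges doubled: what happens if the cross edge is
// visible from one direction only.
const skewed = base.map((e) => (e.weight === 10 ? { ...e, weight: 20 } : e));

const qBase = modularity(base, communityOf);
const qDoubled = modularity(doubled, communityOf);
const qSkewed = modularity(skewed, communityOf);
```

Here `qBase` and `qDoubled` agree, while `qSkewed` differs, which is the non-uniform distortion the comment warns about.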

Options:

  1. Use Math.max (or a first-write-wins guard) instead of accumulation to truly deduplicate.
  2. Keep accumulation but fix the comment and document that the default bidirectional traversal intentionally doubles weights.
Option 1: deduplicate with first-write-wins
           const key = makeUndirectedEdgeKey(sId, dId);
-          undirectedWeightByKey.set(key, (undirectedWeightByKey.get(key) || 0) + w);
+          if (!undirectedWeightByKey.has(key)) {
+            undirectedWeightByKey.set(key, w);
+          }
           continue;

Apply the same pattern to the fallback path at Line 178.
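
The difference between the two strategies can be seen on a few hand-written edge observations. This is a sketch under assumptions: `collectWeights`, the observation shape, and the node names are hypothetical, and node `X` stands for an endpoint outside the `nodes` list whose edge is therefore seen only once.

```javascript
// Hypothetical helper: fold a list of edge observations into a
// key -> weight map using either accumulation or first-write-wins.
function collectWeights(observations, strategy) {
  const weights = new Map();
  for (const { source, target, weight } of observations) {
    // Sorting the pair gives the same key for both orientations.
    const key = JSON.stringify([source, target].sort());
    if (strategy === 'accumulate') {
      weights.set(key, (weights.get(key) || 0) + weight);
    } else if (!weights.has(key)) {
      weights.set(key, weight); // first write wins
    }
  }
  return weights;
}

const observations = [
  { source: 'A', target: 'B', weight: 5 }, // seen from A (outgoing)
  { source: 'A', target: 'B', weight: 5 }, // same edge seen from B (incoming)
  { source: 'B', target: 'X', weight: 5 }, // X is not traversed: seen once
];

const accumulated = collectWeights(observations, 'accumulate');
const deduplicated = collectWeights(observations, 'first-write-wins');
const keyAB = JSON.stringify(['A', 'B']);
const keyBX = JSON.stringify(['B', 'X']);
```

With accumulation the A-B edge ends up twice as heavy as B-X even though both have weight 5 in the graph; with first-write-wins they stay equal. One trade-off to keep in mind: first-write-wins also collapses genuinely distinct parallel edges between the same pair of nodes into a single weight.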


Add a long-running Louvain scaling test (12 * scale nodes; scale=1000 => 12k nodes) gated behind FLEX_LONG_TESTS=1.

Co-Authored-By: Warp <agent@warp.dev>
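
A gate like the one described in this commit message could be read from the environment roughly as follows. This is a guess at the shape, not the code in `tests/exp-algo/louvain.test.js`: the helper name `readLongTestConfig` and the fallback behaviour for invalid values are assumptions; only the variable names, the default scale of 1000, and the `12 * scale` node count come from this page.

```javascript
// Hypothetical helper: derive the long-test settings from an env object.
function readLongTestConfig(env) {
  const enabled = env.FLEX_LONG_TESTS === '1';
  const parsed = Number.parseInt(env.FLEX_LOUVAIN_SCALE ?? '', 10);
  // Assumption: fall back to the default scale when the value is missing
  // or not a positive integer.
  const scale = Number.isInteger(parsed) && parsed > 0 ? parsed : 1000;
  return { enabled, scale, nodeCount: 12 * scale };
}

const defaults = readLongTestConfig({});
const custom = readLongTestConfig({ FLEX_LONG_TESTS: '1', FLEX_LOUVAIN_SCALE: '2' });

// In a Jest file the flag could select between `test` and `test.skip`:
//   const maybeTest = readLongTestConfig(process.env).enabled ? test : test.skip;
```

Taking the environment as a parameter keeps the parsing logic checkable without touching `process.env`.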