Hi HN! I built pyscn for Python developers in the vibe coding era.
If you're using Cursor, Claude, or ChatGPT to ship Python code fast, you know the feeling: features work, tests pass, but the codebase feels... messy.
Common vibe coding artifacts:
• Code duplication (from copy-pasted snippets)
• Dead code from quick iterations
• Over-engineered solutions for simple problems
• Inconsistent patterns across modules
pyscn performs structural analysis:
• APTED tree edit distance + LSH
• Control-Flow Graph (CFG) analysis
• Coupling Between Objects (CBO)
• Cyclomatic Complexity
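To give a feel for the last metric in that list, here is a minimal sketch of cyclomatic complexity using Python's standard `ast` module. It is not how pyscn computes it (pyscn is written in Go on top of tree-sitter). The choice of which node types count as decision points is an assumption of this sketch, and real tools differ on the details.

```python
import ast

# Node types that each add one decision point. This set is an assumption of
# the sketch; real analyzers also handle match statements, comprehensions, etc.
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                 ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands and adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


example = '''
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for i in range(n):
        if i % 2:
            return "has odd"
    return "other"
'''

print(cyclomatic_complexity(example))  # prints 5
```

The count here is 1 (base) + 2 `if` statements + 1 `and` + 1 `for`.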
Try it without installation:
uvx pyscn analyze . # Using uv (fastest)
pipx run pyscn analyze . # Using pipx
(Or install: pip install pyscn)
Built with Go + tree-sitter. Happy to dive into the implementation details!
Since you mentioned the implementation details, a couple of questions come to mind:
1. Are there any research papers you found helpful or influential when building this? For example, I need to read up on using tree edit distance for code duplication.
2. How hard do you think this would be to generalize to support other programming languages?
I see you are using tree-sitter, which supports many languages, but I imagine the CFGs and dependency analysis might be the challenging part.
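To make the first question concrete, here is a rough sketch of the general idea behind structural clone detection. It is not APTED and not pyscn's implementation: it swaps in a much cruder measure, Jaccard similarity over (parent type, child type) pairs from Python's `ast` module. Set-based similarity like this is the kind of thing LSH schemes approximate, whereas a real tree edit distance also accounts for node order and nesting depth. The function names are hypothetical.

```python
import ast


def node_bigrams(source: str) -> set:
    """Collect (parent type, child type) pairs, ignoring names and literal values."""
    tree = ast.parse(source)
    pairs = set()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            pairs.add((type(parent).__name__, type(child).__name__))
    return pairs


def structural_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two bigram sets, in the range [0, 1]."""
    sa, sb = node_bigrams(a), node_bigrams(b)
    if not sa and not sb:
        return 1.0  # two empty programs are trivially identical
    return len(sa & sb) / len(sa | sb)


snippet_a = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
snippet_b = "def sum_all(items):\n    acc = 0\n    for it in items:\n        acc += it\n    return acc\n"
snippet_c = "def greet(name):\n    print('hi', name)\n"

print(structural_similarity(snippet_a, snippet_b))  # prints 1.0 (same shape, renamed variables)
print(structural_similarity(snippet_a, snippet_c))
```

Because identifiers are ignored, a copy-pasted function with renamed variables still scores 1.0, which is the case a purely textual diff tends to miss.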
I’ll add a Qlty plugin for this (https://github.com/qltysh/qlty) so it can be run with other code quality tools and reported back to GitHub as pass/fail commit statuses and comments. That way, the AI coding agents can take action based on the issues that pyscn finds directly in a cloud dev env.