Machine learning
- PyTorch Geometric is an extension of PyTorch for training GNNs on structured data. Its "introduction by example" page and its list of Colab notebooks (with associated videos) are very pedagogical and fun to experiment with.
- Content in French: videos of classes and seminars on Natural Language Processing. The classes are given by Benoît Sagot.
Cybersecurity
Conference deadlines
This website lists the main upcoming cybersecurity conferences with a countdown to their submission deadlines: sec-deadlines.
SQL Injection datasets
The following datasets can be used to train SQL injection detection classifiers:
- WAF-A-MoLE: labeled SQL statements dataset used in the paper “WAF-A-MoLE: Evading Web Application Firewalls through Adversarial Machine Learning”.
- Kaggle SQL Injection labeled dataset. This is a cleaned version of a dataset that is widely used in the domain.
- Libinjection malicious payloads dataset.
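To give an idea of what training such a classifier involves, here is a minimal sketch of a token-level naive Bayes model written with the Python standard library only. The samples are toy strings I wrote for illustration, not rows from the datasets above, and the names `SAMPLES`, `tokenize`, `train` and `predict` are hypothetical. A real experiment would load one of the labeled datasets and would likely use a proper ML library.

```python
import math
import re
from collections import Counter

# Toy, hand-written samples for illustration only (NOT taken from the datasets above).
# Label 1 = injection attempt, label 0 = benign input.
SAMPLES = [
    ("' OR '1'='1' --", 1),
    ("1; DROP TABLE users", 1),
    ("admin' UNION SELECT password FROM users --", 1),
    ("' OR 1=1 #", 1),
    ("john.doe@example.com", 0),
    ("blue running shoes size 42", 0),
    ("SELECT name FROM products WHERE id = 7", 0),
    ("hello world", 0),
]


def tokenize(text):
    # Split into lowercase words, numbers, and single punctuation characters.
    # Punctuation such as quotes and dashes carries a lot of signal for SQLi.
    return re.findall(r"[a-z_]+|\d+|[^\sa-z\d_]", text.lower())


def train(samples):
    # Count token occurrences per class and the number of samples per class.
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for text, label in samples:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab


def predict(model, text):
    # Multinomial naive Bayes with add-one smoothing; unseen tokens are ignored.
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in (0, 1):
        total_tokens = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:
                score += math.log((counts[label][token] + 1) / (total_tokens + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(SAMPLES)
print(predict(model, "' OR 'a'='a' --"))    # flagged as injection on this toy data
print(predict(model, "red shoes size 40"))  # classified as benign on this toy data
```

With only eight training strings this has no practical value as a detector; it only shows the shape of the pipeline (tokenize, count, score) that the datasets above would feed.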
CTFs
Some random tools.
- Web service / social media account finder: sherlock
- PrivEsc vulnerability scanner: linPEAS
- Online image analysis: Aperisolve
- SQL injection detection: sqlmap, OWASP ZAP
Software engineering
- Commit messages specification: conventionalcommits
- Word, Excel, PowerPoint to Markdown: markitdown
- A static analysis tool: semgrep
- A PDF manipulation tool: most administrative procedures require manipulating (merging, signing, reordering…) PDF files, and I am not comfortable relying on unknown online services. Stirling-PDF lets you deploy a web app locally as a Docker container to do this. If you are in a rush, framalab provides an instance at https://stirling-pdf.framalab.org/.
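
As an illustration of the commit message specification listed above, here is a rough sketch of a header check in Python. It only looks at the first line of a commit message (`type(scope)!: description`) and ignores the body and footers, so it is a simplification of the full specification. The names `HEADER_RE` and `parse_header` are hypothetical.

```python
import re

# Simplified pattern for the first line of a Conventional Commits message:
#   <type>[(scope)][!]: <description>
HEADER_RE = re.compile(
    r"^(?P<type>[A-Za-z]+)"            # e.g. feat, fix, docs
    r"(?:\((?P<scope>[^()\r\n]+)\))?"  # optional scope in parentheses
    r"(?P<breaking>!)?"                # optional breaking-change marker
    r": (?P<description>\S.*)$"        # colon, space, non-empty description
)


def parse_header(line):
    # Return the parsed parts of the header, or None if it does not match.
    match = HEADER_RE.match(line)
    if match is None:
        return None
    return {
        "type": match.group("type"),
        "scope": match.group("scope"),
        "breaking": match.group("breaking") is not None,
        "description": match.group("description"),
    }


print(parse_header("feat(parser)!: add array support"))
print(parse_header("fixed a bug"))  # None: no "type: description" structure
```

A check like this could be wired into a `commit-msg` Git hook, although existing linters already cover the full specification.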