Some interesting projects

Machine learning

PyTorch Geometric is an extension of PyTorch for training graph neural networks (GNNs) on structured data. Its "Introduction by Example" page and its list of Colab notebooks (with associated videos) are very pedagogical and good material to experiment with.
Content in French: videos of classes and seminars on Natural Language Processing, taught by Benoît Sagot.

This website lists the main upcoming cybersecurity conferences, with a countdown to their submission deadlines: sec-deadlines.

The following datasets can be used to train SQL injection detection classifiers:

WAF-A-MoLE: labeled SQL statements dataset used in the paper “WAF-A-MoLE: Evading Web Application Firewalls through Adversarial Machine Learning”.
Kaggle SQL Injection labeled dataset. This is a cleaned version of a dataset that is also widely used in the domain.
Libinjection malicious payloads dataset.
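To give an idea of what training such a classifier looks like, here is a minimal sketch of a token-based naive Bayes model in pure Python. The handful of queries in `TRAIN` are hypothetical samples written for this sketch only (they do not come from the datasets above), and the names `tokenize`, `train` and `predict` are illustrative; a real experiment would load one of the labeled datasets and likely use a proper ML library.

```python
import math
import re
from collections import Counter

# Hypothetical toy samples (label 1 = injection, 0 = benign), written for this
# sketch only; replace them with statements loaded from a real labeled dataset.
TRAIN = [
    ("SELECT name FROM users WHERE id = 42", 0),
    ("SELECT * FROM products WHERE price < 100", 0),
    ("UPDATE accounts SET email = 'a@b.c' WHERE id = 7", 0),
    ("INSERT INTO logs (msg) VALUES ('hello')", 0),
    ("SELECT * FROM users WHERE id = 1 OR 1=1 --", 1),
    ("' OR '1'='1' --", 1),
    ("1; DROP TABLE users --", 1),
    ("admin' UNION SELECT password FROM users --", 1),
]


def tokenize(query):
    """Lowercase the query and split it into words, numbers and single symbols."""
    return re.findall(r"[a-z_]+|\d+|[^\sa-z_\d]", query.lower())


def train(samples):
    """Count tokens per class; returns (token counts, sample counts, vocabulary)."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for text, label in samples:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab


def predict(model, query):
    """Return the most likely label (0 or 1) under a multinomial naive Bayes model."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in (0, 1):
        total_tokens = sum(counts[label].values())
        # Log prior of the class.
        score = math.log(docs[label] / total_docs)
        for token in tokenize(query):
            # Laplace smoothing, with one extra slot for unseen tokens.
            score += math.log(
                (counts[label][token] + 1) / (total_tokens + len(vocab) + 1)
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
```

Calling `predict(model, "' OR 1=1 --")` then returns the label the model finds most likely. With so few samples this only demonstrates the mechanics and says nothing about real detection performance.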

Some random tools.

Commit messages specification: conventionalcommits
Word, Excel, PowerPoint to Markdown: markitdown
A static analysis tool: semgrep
A PDF manipulation tool: any administrative procedure requires manipulating (merging, signing, reordering…) PDF files, and I do not feel comfortable relying on unknown online services. Stirling-PDF lets you deploy a web app locally as a Docker instance to perform these operations. If in a rush, framalab provides an instance at https://stirling-pdf.framalab.org/.
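As an illustration of the commit message format described by conventionalcommits, here is a small sketch that checks the first line of a commit message against the `<type>[optional scope][!]: <description>` shape. The regular expression and the `parse_header` helper are my own simplification, not an official validator: they ignore the message body and footers (including `BREAKING CHANGE:` footers) and accept any alphabetic type.

```python
import re

# Simplified pattern for the header line only (an assumption of this sketch):
# type, optional "(scope)", optional "!" for breaking changes, then ": description".
HEADER_RE = re.compile(
    r"^(?P<type>[A-Za-z]+)"
    r"(?:\((?P<scope>[^()\r\n]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def parse_header(line):
    """Return the parsed header fields as a dict, or None if the line does not match."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    return {
        "type": match["type"],
        "scope": match["scope"],  # None when no scope is given
        "breaking": match["breaking"] is not None,
        "description": match["description"],
    }
```

For example, `parse_header("feat(parser)!: add array support")` yields a dict with type `feat`, scope `parser` and `breaking` set to `True`, while a free-form message such as `"updated stuff"` yields `None`. Such a check could be called from a `commit-msg` Git hook.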