MCH2022

Building a stream-based NLP (Natural Language Processing) app to monitor vulnerabilities in real time
2022-07-24, 14:00–15:00, Area 42 Workshops

This workshop will be held in our workshop tents located on Backbone Boulevard between Liskov field and Flower field.

We'll set up a stream-based Python app that uses NLP (Natural Language Processing) to monitor new vulnerabilities in real time. We'll experiment with some basic NLP using spaCy to spot when vulnerabilities may start trending. Using Faust (stream processing), we'll monitor RSS feeds, tweets and the NVD database and extract important keywords from them.
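
To give a feel for what "extracting important keywords" can look like, here is a minimal sketch that uses only the Python standard library. It is a simplified stand-in, not the spaCy-based script from the workshop: the stopword list, the regular expressions and the function names `extract_keywords` and `count_keywords` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stopword list; an NLP library such as spaCy
# ships a far more complete one.
STOPWORDS = {"a", "an", "the", "in", "of", "to", "and", "is", "as", "for", "on", "with", "by"}

# CVE identifiers look like CVE-<year>-<4 or more digits>.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)
# Very rough tokenizer: lowercase words that may contain digits (e.g. "log4j").
TOKEN_PATTERN = re.compile(r"[a-z][a-z0-9]+")


def extract_keywords(text):
    """Return (cve_ids, keywords) found in a single feed item or tweet."""
    cve_ids = sorted({match.upper() for match in CVE_PATTERN.findall(text)})
    # Strip the CVE ids first so "cve" itself is not reported as a keyword.
    remainder = CVE_PATTERN.sub(" ", text).lower()
    keywords = [t for t in TOKEN_PATTERN.findall(remainder) if t not in STOPWORDS]
    return cve_ids, keywords


def count_keywords(texts):
    """Count in how many items each CVE id or keyword appears."""
    counts = Counter()
    for text in texts:
        cve_ids, keywords = extract_keywords(text)
        counts.update(cve_ids)
        counts.update(set(keywords))  # count each keyword once per item
    return counts


if __name__ == "__main__":
    sample = [
        "Critical RCE in Log4j tracked as CVE-2021-44228",
        "Log4j exploit for cve-2021-44228 seen in the wild",
    ]
    for term, count in count_keywords(sample).most_common(3):
        print(term, count)
```

Counting each term once per item keeps a single long article from dominating the totals; a term that shows up across many items in a short period is a rough signal that it may be trending.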


The results will be saved in ElasticSearch, where you'll be able to create fancy graphs about what's currently trending!
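
As a rough illustration of how extracted keywords could end up in ElasticSearch, the sketch below builds a request body in the newline-delimited JSON format that the ElasticSearch bulk API expects (an action line followed by the document, with a trailing newline). The index name `vulnerability-keywords`, the document fields and the helper names are assumptions for this example and are not taken from the workshop repositories.

```python
import json
from datetime import datetime, timezone


def keyword_document(source, keyword, seen_at):
    """Build one document per (source, keyword) observation.

    The field names are hypothetical; "@timestamp" is a common convention
    for the time field used by Kibana visualisations.
    """
    return {
        "source": source,
        "keyword": keyword,
        "@timestamp": seen_at.isoformat(),
    }


def to_bulk_body(index_name, documents):
    """Serialise documents as bulk-API NDJSON: action line, then the document."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    now = datetime(2022, 7, 24, 14, 0, tzinfo=timezone.utc)
    docs = [
        keyword_document("rss", "log4j", now),
        keyword_document("nvd", "CVE-2021-44228", now),
    ]
    # Hypothetical index name, chosen for this example only.
    print(to_bulk_body("vulnerability-keywords", docs), end="")
```

With one document per keyword observation, a date histogram on the timestamp field split by keyword would be one way to graph what is trending.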

Virtual machines running most of the components will be made available. The VMs will run:
- Kafka (Zookeeperless)
- ElasticSearch
- Kibana
- Postgres
- Miniflux

(Basic) Python knowledge will be required for debugging. The source code is ready as-is, so we'll only have to install dependencies and start running :) If time allows, bonus content may include using the same stack to monitor for phishing campaigns via certificate transparency logs.

PS. I'll be able to facilitate about 6 people. I can organise multiple sessions if there is more interest :)

Materials used:
- https://github.com/d3vzer0/mch2022-workshop-streaming (primary Faust source code to watch Twitter, RSS feeds and the NVD database and save results to ElasticSearch)
- https://github.com/d3vzer0/mch2022-workshop-nlp (the API exposing our basic keyword extraction script)
- https://github.com/d3vzer0/mch2022-workshop-cloud (the Terraform/Ansible code I used to deploy the environment)

Workshops organized by the Area42 team.

Our village is located behind the main music stage. The workshop tents are located on Backbone Boulevard between Liskov and Flow fields.

Python hobbyist and Security Engineer @ Schuberg Philis. Mostly involved with all things blue-team related and data engineering :)