I. Basics

Discuss the topic with me.
The preferred programming languages are Kotlin and Python. Talk to me if you want to use another language (Java, C++, and C# are generally OK).
Tip: Use LLMs or coding agents (ChatGPT, JetBrains Junie, Cursor, etc.) to help you with bits of code, e.g., with processing Pandas data frames. Make sure you verify the answers.
The review process takes some time. Therefore, I need to receive your pull request at least 2 weeks before you want a grade. During the summer holidays, reserve at least 4 weeks for this (unless agreed otherwise).

II. Git and Pull Request

The code must be submitted via a pull request in a git repository.
1. Create an empty repository in Bitbucket, GitHub, or MFF's GitLab
2. Create a new branch (this is the so-called feature branch; give it any name you want)
3. Commit all your code to that branch (not to the main/master branch)
4. Create a Pull Request
5. Add me as a reviewer (jhana for Bitbucket, jirka-x1 for GitHub; you need to add me as a collaborator first)
6. Send me a non-automated email about this with the subject "NPFL128 Project<your-first-name><your-last-name><assignment-number>"
You have to react to each of my comments in one of these ways:
1. reply "fixed" if you agree and have fixed it,
2. ask for more detail, or
3. disagree.
  Do not silently fix or ignore a comment.
Do not mark my comments as resolved. The reviewer decides whether they are resolved or not. It is really hard to see how they were addressed when they are marked as resolved.

There must be a README file in the repository root (use Markdown or plain text). It must specify
1. What the project is solving (add a reference to the relevant paper)
2. Scope - what you implemented and what is left for future work
3. How to run the project
4. Add a file with sample output if appropriate
Do not use Jupyter notebooks. Write regular production-like code that can be executed from the command line with appropriate parameters. (A notebook can be added as an example of use, but it must call the regular functions you define.)
Split your code into functions. Write functions that
1. each solve a single well-defined task
2. are documented
3. are testable; bonus points for unit tests
4. ideally produce their output via the return value only. If a function has to modify any of its parameters, document it
Use self-explanatory variable and function names
Document all functions (what they do, what is input, what is output); add other comments as appropriate
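As an illustration of the points above, here is a minimal sketch of a function that solves one task, is documented, and returns its result without modifying its input. The function name normalize_tokens and the docstring layout are only assumptions chosen for this example; any consistent docstring style works.

```python
from typing import List


def normalize_tokens(tokens: List[str]) -> List[str]:
    """Strip whitespace from tokens, lowercase them, and drop empty ones.

    Args:
        tokens: Raw tokens, possibly padded with whitespace.

    Returns:
        A new list of cleaned tokens. The input list is not modified.
    """
    stripped = (token.strip().lower() for token in tokens)
    return [token for token in stripped if token]
```

Because the function has no side effects, a unit test only needs to call it and compare the return value with the expected list.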
You must use appropriate libraries to handle JSON, XML, csv, etc. Do not write the code yourself.
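For example, the standard csv and json modules already handle quoting and escaping correctly, which hand-written splitting on commas does not. The sketch below is only an illustration; the function names parse_csv_rows and rows_to_json are hypothetical.

```python
import csv
import io
import json
from typing import Dict, List


def parse_csv_rows(text: str) -> List[Dict[str, str]]:
    """Parse CSV text with a header row into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_to_json(rows: List[Dict[str, str]]) -> str:
    """Serialize the rows to a JSON string, keeping non-ASCII characters."""
    return json.dumps(rows, ensure_ascii=False)
```

Note that a quoted field such as "hello, world" is kept as a single value by csv.DictReader, whereas a naive text.split(",") would break it in two.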
All files must end with a newline.

Strictly follow the PEP 8 code style; a reasonable IDE (e.g., IntelliJ IDEA, PyCharm, MS VS Code) checks this for you.
Use type hints for function parameters and return values. First, type hints make it easier to understand your code. Second, they help you avoid many errors (a reasonable IDE or a type checker, e.g., http://www.mypy-lang.org/, will do the code analysis). Use them at least for basic types (str, int, List, Set, Iterable, Tuple, Optional, Mapping, Dict, ...). You do not need to use them for complex types from third-party libraries where it is not clear what the type is.
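A small sketch of what such hints can look like, using Mapping and Optional. The function most_frequent_tag and its data layout (word -> tag -> count) are assumptions made up for this example.

```python
from typing import Mapping, Optional


def most_frequent_tag(
    word: str,
    tag_counts: Mapping[str, Mapping[str, int]],
) -> Optional[str]:
    """Return the most frequent tag of a word, or None for an unseen word."""
    counts = tag_counts.get(word)
    if not counts:
        return None
    # Pick the (tag, count) pair with the highest count and return its tag.
    return max(counts.items(), key=lambda item: item[1])[0]
```

The Optional[str] return type tells both the reader and the type checker that callers have to handle the None case.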
Include a pyproject.toml, requirements.txt, or setup.py file so I can easily install third-party libraries. Make sure that all libraries have their versions specified.
Use the main function pattern
Use argparse to parse arguments, pathlib's Path for all files and directories, Counter to count objects.
Use the with statement to open and manipulate files.
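The last few points can be combined in one short script. The following is a sketch only, not a required template: the word-counting task, the function names, and the --top option are assumptions chosen for illustration.

```python
import argparse
from collections import Counter
from pathlib import Path
from typing import Iterable, List, Optional


def count_words(paths: Iterable[Path]) -> "Counter[str]":
    """Count whitespace-separated words in UTF-8 text files."""
    counts: "Counter[str]" = Counter()
    for path in paths:
        # The with statement closes the file even if an error occurs.
        with path.open(encoding="utf-8") as lines:
            for line in lines:
                counts.update(line.split())
    return counts


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command-line arguments; argv=None means sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Print frequent words.")
    parser.add_argument("files", nargs="*", type=Path,
                        help="text files to read")
    parser.add_argument("--top", type=int, default=10,
                        help="number of words to print")
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> None:
    """Entry point: parse arguments, count words, print the result."""
    args = parse_args(argv)
    for word, count in count_words(args.files).most_common(args.top):
        print(f"{word}\t{count}")


if __name__ == "__main__":
    main()
```

Assuming the script is saved as word_count.py (a hypothetical name), it could be run as: python word_count.py corpus.txt --top 5. Because main accepts an explicit argument list, it can also be called from a unit test or a notebook without touching sys.argv.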