  • tedmanln

Contributing to Timesketch, an Open-Source Tool for Cybersecurity Analysis

Figure 1. My system diagram of Timesketch

Introduction:

In March 2024, I contributed to Timesketch’s open-source code under the guidance of Raja Chattopadhyay, a senior Software Engineer Manager at Capital One. At the time of writing, Timesketch has nearly 2500 stars and 112 contributors on GitHub, a sign of its popularity. I was excited to explore a large codebase, see how hundreds of files are linked together, and ultimately become a contributor by implementing a solution. Below, I cover why Timesketch matters, how I went about resolving the issue, and what I learned.


Let’s say an intruder hacks into your company’s network and changes a large number of files, but you don’t know which ones or when they were changed. You need a way to organize your server logs and network traffic, visualize them, and analyze the damage. Say hello to Timesketch, a powerful tool to add to your arsenal. You can upload those logs, create tags to filter events, and even generate a visual graph. You can also automate some of these tasks through Timesketch’s Application Programming Interface (API), with a bit of Python knowledge and a read through the documentation. For example, you could use Python to extract data from your sketches, conduct searches, and create dataframes to report your findings in an online notebook such as a Jupyter notebook.


Figure 2. Timesketch running on GitHub Codespaces, a preconfigured environment offered by GitHub.

Users can start a new sketch by clicking on the blue button “Blank sketch” and view previous sketches at the bottom. Running the client requires Docker and a machine with Ubuntu 22.04.


A Note about API:

Before I explain the bug I worked on, I’d like to explain some key details about the API I mentioned earlier. Let’s say you own a website about cats, and you want to display a random cat fact every day. You could Google a random fact and edit your website by hand every day, but that would be pretty inefficient. A better solution is to automate it by pulling a random fact from a database so the site displays a different fact each day. Let’s take a look at Daily Cat Facts. There is a base URL, and by attaching /facts to the end of it, you’ll see a list of facts. Try it yourself at: https://cat-fact.herokuapp.com/facts

 

In essence, the API is like a mailman between you and this cat server. To talk with the mailman, you need to send a letter that’s formatted in a particular way so Daily Cat Facts can understand it, like writing both a return address and a destination address on the envelope.


In the world of APIs, there are specific status codes that let the user know whether the request was accepted or refused (for example, due to wrong formatting or improper authentication). The most infamous is error code 404, which means the resource was not found. I cannot stress enough the importance of returning the correct status code, and this was exactly the issue Timesketch was experiencing.
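To make the status-code idea concrete, here is a minimal sketch using only Python's standard library. It is not the Daily Cat Facts service: the handler, the `/facts` path, and the placeholder data are assumptions made for illustration. It starts a tiny local server that answers 200 for one known path and 404 for everything else, then asks for both.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder data for illustration only, not real API output.
FACTS = ["placeholder fact 1", "placeholder fact 2"]


class FactHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: 200 for /facts, 404 for anything else."""

    def do_GET(self):
        if self.path == "/facts":
            status, payload = 200, {"facts": FACTS}
        else:
            status, payload = 404, {"error": "resource not found"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def get_status(host, port, path):
    """Send a GET request and return only the HTTP status code."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        response.read()
        return response.status
    finally:
        connection.close()


def demo():
    """Serve on a free local port and request a valid and an invalid path."""
    server = HTTPServer(("127.0.0.1", 0), FactHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_port
        return (
            get_status("127.0.0.1", port, "/facts"),
            get_status("127.0.0.1", port, "/fact"),  # mistyped path
        )
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the pair of status codes for the valid and the mistyped path, which is the distinction a client relies on to know whether its request reached a real resource.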


Figure 3. A visual diagram depicting standard error codes. Source: https://restfulapi.net/wp-content/uploads/HTTP-Error-Codes.jpg

Exploring Timesketch’s Bug:


Since December 2023, Timesketch had an outstanding issue in which a user could request an incorrect address but get feedback that it worked. In other words, the user would receive code 200 (OK) rather than 404 (resource not found).


Figure 4. Timesketch #3005 Reproduction, displaying code 200 for an invalid API endpoint

Above is a screenshot of my reproduction of the issue (through GitHub Codespaces). I entered an invalid endpoint in my browser, opened “Inspect”, and switched to the Network tab to check the status of the request. Sure enough, I received status code 200.


Finding the Source:

To solve the problem, I needed to identify the files responsible for the error. My biggest barrier was that there are hundreds of files, some with more than a thousand lines of code. I came up with a few keywords that I could use to search the codebase: API, Errors, HTTP. My search led me to the following files: “app.py”, “routes.py”, and “errors.py”. Additionally, upon examination of the headers, it appears that Timesketch is using something called Flask. My findings are discussed below.



Figure 5. App.py file which adds valid routes using add_resource method

The app file looks promising. It contains bits and pieces of the API endpoint that I am diagnosing, which is “/api/v1”. At line 145, it appears that the prefix “api/v1” is added to the API endpoint. Additionally, there is a for loop that adds something called routes. I traced the code and arrived at the following file:



Figure 6. Routes.py file which contains all valid routes.

Within the context of this code, routes appear to be the valid API entry points, and there are nearly 70 of them. These are all the routes being added by that for loop in app.py. I examined the header and realized that Timesketch is using something called “Flask”. According to Python Basics (pythonbasics.org), Flask is a web framework for Python. More specifically, it lets users build web applications, and run them on their own computer, using Flask’s libraries. Flask is particularly useful when creating an API.
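The pattern of a route table that a loop registers under a common prefix can be sketched as follows. This is a simplified illustration, not Timesketch's code: it uses plain Flask's `add_url_rule` in place of the `add_resource` method shown in the figure, and the route names, view functions, and `V1_ROUTES` list are hypothetical.

```python
from flask import Flask, jsonify

API_PREFIX = "/api/v1"


def version():
    """Hypothetical view function."""
    return jsonify({"version": "example"})


def sketch_list():
    """Hypothetical view function."""
    return jsonify({"objects": []})


# Hypothetical route table, standing in for the list kept in routes.py.
V1_ROUTES = [
    (version, "/version"),
    (sketch_list, "/sketches"),
]


def create_app():
    """Create the app and register every route under the API prefix."""
    app = Flask(__name__)
    for view_func, path in V1_ROUTES:
        app.add_url_rule(
            f"{API_PREFIX}{path}",
            endpoint=view_func.__name__,
            view_func=view_func,
        )
    return app
```

Keeping the table in one module and the loop in another means a new endpoint only needs a new entry in the table, which matches the split between routes.py and app.py described above.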



Figure 7. Flask dependencies in App.py

I referenced Timesketch’s documentation and, sure enough, the developers had left some more notes regarding these files:


Figure 8. Source: https://timesketch.org/developers/getting-started/

How do all of these pieces fit together? What is our desired outcome?


Desired Solution:

·       Implement a solution that returns 404 code for invalid endpoints

·       Provide valuable feedback for the user so the user can fix mistakes

·       Make sure the solution does not break anything else (complete unit tests to ensure)

 

To provide a bird’s-eye view, I created a custom system diagram representing Timesketch’s tech stack. I worked specifically with Docker commands, pytest, and app.py, which is part of the Flask API setup.


Figure 9. Timesketch's tech stack.

Backend:

Flask: The application server.

·       App.py > creates the instance of the API.

·       Routes.py > valid routes added to app.py

·       Python: Resources_test.py > unit test of the solution

 


The Solution:

While it is easy to get lost in the code, let’s not complicate the issue. At its root, the solution is like creating a coffee filter. My role is to stop impurities, while allowing the good stuff through. After conducting research on Flask documentation, I knew that:

  1. I needed to implement the solution in app.py, which creates the instance of the API.

  2. The solution had to be placed after all the valid routes were registered.

  3. The solution needed to be readable and properly commented.

  4. Thanks to Flask, I could use the @app.route decorator to intercept requests.

  5. By using <path:path>, I could intercept a path of any length, even one containing “/”.

  6. We want to catch: host:url/api/v1/invalidroute

  7. But also catch: host:url/api/v1/invalid/route
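The points above can be sketched as a small standalone Flask app. This is an illustration under assumptions, not the code that was merged: the `version` endpoint, the function names, and the wording of the error message are hypothetical, and the login check is left out to keep the example short.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/api/v1/version")
def version():
    """Hypothetical valid endpoint, standing in for the real routes."""
    return jsonify({"version": "example"})


# Catch-all for anything under /api/v1/ that no other rule matched.
# The <path:path> converter also matches strings that contain "/".
@app.route("/api/v1/<path:path>")
def invalid_endpoint(path):
    message = (
        f"The endpoint '/api/v1/{path}' does not exist. "
        "Please check the URL and try again."
    )
    return jsonify({"status": 404, "message": message}), 404
```

With Flask's test client, `app.test_client().get("/api/v1/invalid/route")` reaches `invalid_endpoint`, while `/api/v1/version` is still served by its own view, because the more specific rule takes precedence over the converter.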


Here is the final implementation of my code:



Figure 10. From PyCharm, implementing a solution that provides the user with error code 404 via jsonify

As you can see:

1.     If a request uses any path other than the valid routes, it is routed to my method.

2.     In addition, it returns an error message that lets the user know to recheck their endpoint. 

3.     A concise comment is left to inform future programmers of the method (highlighted in green)

 

To use these methods, include in the header section:

from flask import jsonify

from flask_login import login_required


Note that @login_required is used in addition to @app.route. @login_required is a decorator from the Flask-Login extension that ensures the user is logged in before this check occurs.


If the user is not logged in:

  • The user should be redirected to the login page first and foremost

If the user is not logged in but @login_required is NOT used:

  • The user is instead informed that they mistyped their address (even though it may be a valid endpoint!) You can see how this would be confusing. Furthermore, they aren’t redirected.
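The two cases above can be sketched with a small stand-in decorator. To keep the example runnable with only Flask installed, `login_required_sketch` below is a hypothetical substitute for Flask-Login's `login_required`; the session key, the `/login` route, and the message text are also assumptions.

```python
from functools import wraps

from flask import Flask, jsonify, redirect, session, url_for

app = Flask(__name__)
app.secret_key = "example-only"  # required for the session cookie


def login_required_sketch(view):
    """Hypothetical stand-in for flask_login.login_required."""

    @wraps(view)
    def wrapper(*args, **kwargs):
        if not session.get("user"):
            # Not logged in: send the user to the login page first.
            return redirect(url_for("login"))
        return view(*args, **kwargs)

    return wrapper


@app.route("/login")
def login():
    """Hypothetical login page."""
    return "login page"


# @app.route stays outermost so the login check wraps the view itself.
@app.route("/api/v1/<path:path>")
@login_required_sketch
def invalid_endpoint(path):
    return jsonify({"message": "Invalid API endpoint."}), 404
```

An anonymous request to an unknown path is redirected to the login page, and only a logged-in user sees the 404 message, so nobody is told an address is mistyped before they have authenticated.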


After I pitched the code to my team, we performed the following functional tests.


Functional Tests:




Figure 11. Test Case #1 Results

Figure 12. Test Case #2 Results


Figure 13. Test Case #3 Results


Figure 14. Test Case #4 Results

Unit Tests:

It was at this point that my mentor, Raja Chattopadhyay, stressed to me the importance of unit tests. Unit tests let programmers automate the checking process, so they don’t have to repeat these functional tests by hand each and every time. In addition, they show other programmers that the author has done their due diligence in verifying their own code.


My team and I decided that the unit test belonged in resources_test.py, which tests the validity of proper routes. My colleague came up with the following first version:



Figure 15. A prototype of our unit test

This is a good start. The assert404 method comes from the Flask-Testing extension; it takes the response and checks whether the status code received is 404. If it is not a 404, the test fails and the programmer sees an error message when the tests run.

 

I had the following suggestions:

  1. Place the test in its own class to make it obvious that it is different from the other resource checks

  2. Store the URL in its own variable to make it more obvious that it is an invalid resource


With my suggestions, this is the final revision of the code:



Figure 16. Final revision of our unit test.
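For readers who cannot see the screenshots, the shape of such a test can be sketched as follows. It is an approximation under assumptions: it uses the standard library's `unittest` and Flask's built-in test client rather than Timesketch's own test base class, and `create_app`, the class name, and the URL are hypothetical.

```python
import unittest

from flask import Flask, jsonify


def create_app():
    """Hypothetical minimal app with only the catch-all route."""
    app = Flask(__name__)

    @app.route("/api/v1/<path:path>")
    def invalid_endpoint(path):
        return jsonify({"message": "Invalid API endpoint."}), 404

    return app


class InvalidEndpointTest(unittest.TestCase):
    """Kept in its own class, separate from the valid-resource checks."""

    # Stored in a named variable so the intent of the URL is obvious.
    invalid_resource_url = "/api/v1/invalidresource"

    def setUp(self):
        self.client = create_app().test_client()

    def test_invalid_endpoint_returns_404(self):
        response = self.client.get(self.invalid_resource_url)
        self.assertEqual(response.status_code, 404)
```

Both suggestions show up here: the dedicated class signals that this check is about an invalid resource, and the named URL variable documents that the address is wrong on purpose.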

To run the tests in resources_test.py, run the following command in your terminal from the root of the repository:


$ python3 -m pytest ./timesketch/api/v1/resources_test.py


Note: You may encounter errors if certain dependencies are not installed. Install any missing module with:


$ pip install modulenamehere


Here are the results of running our test:



Figure 17. Unit Tests results in conjunction with other tests

Even though our test passes, we need to test the test.


To check the validity of the test, I commented out our new code and ran the test again. If the test is doing its job, it should now fail.


Figure 18. First, comment out the new code in app.py

Then, run the exact same test and look at the results.


Figure 19. Test result demonstrating that the test now fails. Received 200 instead of 404.
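The same "test the test" idea can be sketched in code by building the app with and without the new route. The source does not say why Timesketch answered 200 for unknown paths, so the `frontend_fallback` view below is purely a hypothetical stand-in that reproduces the symptom; `create_app` and its `with_catch_all` flag are also assumptions.

```python
from flask import Flask, jsonify


def create_app(with_catch_all=True):
    """Build a demo app, optionally without the new 404 catch-all."""
    app = Flask(__name__)

    @app.route("/<path:path>")
    def frontend_fallback(path):
        # Hypothetical: answers 200 for any path, mimicking the bug.
        return "<html>app shell</html>", 200

    if with_catch_all:

        @app.route("/api/v1/<path:path>")
        def invalid_endpoint(path):
            return jsonify({"message": "Invalid API endpoint."}), 404

    return app


def status_for_invalid_endpoint(app):
    """Return the status code for a deliberately invalid API path."""
    return app.test_client().get("/api/v1/invalidresource").status_code
```

Comparing `status_for_invalid_endpoint(create_app(True))` with `status_for_invalid_endpoint(create_app(False))` shows the check distinguishing the fixed app from the unfixed one, which is the evidence that a passing test actually depends on the new code.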

With both functional and unit tests passing, my team and I looked over Timesketch’s style guide to make sure the change conformed to it. Then I finalized the commits and submitted the pull request.


Acknowledgements:

  • My mentor Raja Chattopadhyay, for his insights on expanding my network, leveraging LinkedIn, and pursuing cloud certification to supplement my knowledge.

  • CodeDay and CTI, for connecting me with this opportunity

 
