OK, first let’s get one thing clear. This is an educational project for me to learn some computer vision algorithms (I want to build an insect identification system to protect a beehive, but that is a topic for a much later article), and the game Elite Dangerous simply provided an interesting guinea pig for testing some principles. This was never intended as a game cheat/bot or anything like that. In the last chapter I will give my thoughts on AI playing games through undetectable external “human loops” (e.g. looking at a monitor and pushing keys) that no anti-cheat will ever catch, but that is way beyond my motivation: I personally like Elite as it is and definitely do not want to destroy its internal mechanics by creating a farming/trading bot this way. That is also the reason why the code of my experiments is not disclosed. Unless I figure out some clever way to share it without script kiddies turning this experiment into a cheat, I will never share this code. If you are here looking for a game cheat, you will not get it. If you are here to learn how to program a cheat for yourself, then kudos to you; as I try to explain in the conclusion below, this is not really preventable. But it is your responsibility, so think hard about whether you are destroying the very game you like (although you might learn some skills for your career in the process).
So now that this is said, let’s look at the interesting challenges of using computer vision (provided by the OpenCV library) to analyze flight in Elite Dangerous.
Preamble. Why Elite Dangerous?
Simple, a combination of two things. First, I like this game and I am familiar with it. Second, this is a game with a 3D-rendered perspective GUI that shifts position, size and rotation during flight via simulated inertial forces. This makes it a good experimental platform for trying out, for example, the kind of algorithms that a physical humanoid robot would need when trying to drive a car. Compared to a flat 2D GUI, like some old games had, this introduces the challenges that I wanted to focus on: not simply reading a specific instrument, but actually finding that instrument in real time in 3D space and adjusting the reading of that instrument via 3D perspective transformations. Have a look below at the difference compared to the old 1991 game “F29 Retaliator”: that GUI is simple to find and analyze, and in the worst case you can write a per-pixel classification for it. It is nothing like what Elite has, or what you as a human experience when driving a car.
And even worse, Elite’s instruments are partly transparent, which makes them super hard to work with even for the more advanced algorithms, because these can be literally blinded by in-game objects. E.g. the navigation target pointer appearing behind a sun, or even cockpit reflections when you are close to a sun, literally destroy the ability of some classification algorithms (the keypoint family) to locate that instrument on the screen. Example below: in that situation nothing can read the top-left panel, not even humans.
So despite all this, “something” is possible and we will investigate it together below. Just keep in mind that everything we do here has only a statistical success rate, meaning no classification presented here will work 100% of the time, and that is why a game-playing bot built on it will also have only statistical results.
Part I. Investigation of basic “off the shelf” computer vision capabilities
As stated before, I wanted to learn CV (computer vision), and after some research I gravitated to OpenCV (https://opencv.org/) and, because I am lazy when prototyping, to the Python wrapper around it (OpenCV-python-Tutorials-page). The Python wrapper uses the C++ library in the background, so performance is not as bad as you would expect if this were all vanilla Python. So you get easy rapid prototyping with the Python CLI while calling much more powerful C++ code in the background.
Secondly, I do not want to turn this article into a detailed OpenCV tutorial, because OpenCV already has absolutely great tutorials. This will just skim the surface by showing the main types of algorithms available, so that later we can check how I used them in Elite.
Now, what are the basic classes of problems you can learn there?
I.I. HSV color filtering
This is not really a special algorithm, only a trick to filter the colors that really interest you, which makes your life easier later in the analysis. The point here is to convert from RGB (red-green-blue) to something called HSV (hue-saturation-value).
With the image/video in this color representation, it is simple to find those colors (at a specific intensity!) that are interesting for you. For our example, I have helped my system by turning the cockpit GUI green and then simply filtering it through a series of trial-and-error attempts, playing with a few sliders as shown below.
I.II. Template Matching
This basically takes a small image as a target and tries to find it in a larger image, by the basic principle of moving the template image over the big image pixel by pixel and computing how closely the pixel values match. This is compute-intensive, and it can produce a lot of false positives, especially if the template image is NOT present: because this algorithm finds the statistically best match, there will always be a best match, even if it is nonsense.
Here is an example of how this can find a face in a photo, if you provide exactly the same face as a template (courtesy of the OpenCV tutorials mentioned above):
For Elite, this can be applied to find the exact place where a template image is located, IF you know that the image is definitely present, for example the NAV compass. The NAV compass has a specific shape that does not change very much, so we can find it. To make it a bit simpler, we can limit the area in which the algorithm should look for it (this saves a lot of CPU power).
Additionally, once we find the NAV compass, we can cut it out and, via HSV filtering, isolate the blue dot to determine where it is (after masking, this is simply a matter of looking for white pixels in sectors). Here is a result where, on a live feed, we find the exact position of the NAV compass, extract it, mask it via HSV and identify where it is pointing (LEFT vs RIGHT, UP vs DOWN). This is a cool trick that will be very useful later.
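The last step, turning the masked compass into LEFT/RIGHT and UP/DOWN, can be sketched as follows. This assumes you already have a binary mask of the cropped compass in which only the dot is white; the function name and the dead zone of 2 pixels are hypothetical, and this sketch uses the centroid of the white pixels rather than explicit sectors.

```python
import numpy as np


def compass_direction(dot_mask, deadzone=2):
    """Classify where the white dot sits relative to the center of the mask.

    Returns (horizontal, vertical) with values LEFT/RIGHT/CENTER and
    UP/DOWN/CENTER, or None when the mask contains no white pixels.
    """
    ys, xs = np.nonzero(dot_mask)
    if xs.size == 0:
        return None
    height, width = dot_mask.shape
    dx = xs.mean() - (width - 1) / 2.0
    dy = ys.mean() - (height - 1) / 2.0   # image y grows downward
    if abs(dx) <= deadzone:
        horizontal = "CENTER"
    else:
        horizontal = "RIGHT" if dx > 0 else "LEFT"
    if abs(dy) <= deadzone:
        vertical = "CENTER"
    else:
        vertical = "DOWN" if dy > 0 else "UP"
    return horizontal, vertical


# Synthetic 41x41 compass mask with the dot in the upper-right area.
compass = np.zeros((41, 41), dtype=np.uint8)
compass[5:9, 30:34] = 255
print(compass_direction(compass))
```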
I.III. Keypoint Matching
The previous process has one big disadvantage: it cannot reliably detect the presence or absence of the template. This is where keypoint matching can help. This process takes a template and tries to identify all its “corners” (keypoints). So it can detect a template image in a bigger image, it is rotation invariant (ergo the picture can be rotated), and it is partially robust against parts of the template being covered.
For Elite Dangerous, this is needed to actually identify the presence of different types of messages and objects on the screen. To give one of the simpler examples, here is a video of these algorithms detecting the nav beacon (the thing in the center where you want to go) and the “throttle up” message by comparing the video to a template.
I.IV. Haar Cascade
This is the only approach here that you need to pre-train to find the object you are looking for (strictly speaking it is a cascade of boosted classifiers rather than a neural network). In my case, I experimented with this family of algorithms on the most complex task I could think of, and that is leaving the station, which means finding the door out. Now this is much more complex than it sounds.
NOTE to help your own training: it is extra hard for you as the creator, because you need to feed the system with ideally thousands of pictures that contain the object you are looking for. The system will try to train itself based on these pictures. And since OpenCV is open source, with its typical “stability” of code and tutorials, I spent a lot of time just learning the tools and the picture formats/types they expect, so if you want to follow along, I really recommend you use the preparation scripts and process from mrnugget on GitHub. This will save you hours!
The Haar cascade trainer creates a set of features to look for, with specific size and intensity (this is the training result), which can then be applied to pictures. As the simplest example (from the OpenCV tutorials again), there are three basic feature types the trainer works with, (a) edge, (b) line and (c) four-rectangle, and it applies them to a picture as visible below:
Now in Elite, the results of a simple Haar cascade were not really great, but maybe that is because I only gave the trainer several dozen images and not thousands (come on, there are not enough Elite Dangerous pictures like that in Google Images). So I needed to combine it with something else to focus the Haar cascade more on the object itself. I tried combining it with template matching, together with some brute-force rotation and scaling variations, while the Haar cascade would only say yes/no as the last classification decision at the end.
The results are better than expected for a first try, as I am definitely getting more than 50% correct identifications, but that is not yet enough for successful control of Elite (since the risk of collision-death here is high). Here is an example from a prototype run trying to locate this template of an exit door:
So here is the result of applying the Haar classifier (simple training) together with template matching (rotation and scale invariance via brute force). Note the delays between images: that is how long the classifier is working, so it takes 3-4 seconds per image. So this is great, but not applicable to live video or to actually flying in Elite as a bot 🙁
I.V. Live tracking after object already identified in previous frames
This is a different family again. This time we are not identifying something, but keeping track (e.g. of the position on screen) of an object that was previously identified. Again using the OpenCV family of algorithms, this allows us to gain some framerate, as you do not need to constantly identify objects (with the Haar classifier above it took several seconds); you just keep tracking an object identified once.
There are several algorithms available. Some are better but cannot handle scale changes or rotation; some are worse, but they do allow the object to “come closer” to the camera and scale the tracked frame with it. It takes time to experiment and think about your specific use case, but for our small Elite analysis, I am extending the task of finding the door from above to simply tracking it.
So here is a simple example where I marked the door manually with my mouse in a screenshot, and the code then tracked that door in a live video I made. It was relatively successful (i.e. it tracks the door, but imagine an autopilot going for the “center” of the box: many times the box center is actually on a wall!! So not yet flight ready).
Part II. Creating a long-distance jumping and cruising autopilot
First of all, this is an autopilot, not an AI (in the true sense), and also not a full bot designed to farm or cheat in any way, since it cannot really operate alone and get itself out of dangerous situations. Here is a list of remaining things that would need to be solved for this to be considered a cheat, and some of these are much harder than you think:
- Survive interdiction. Right now this autopilot, if interdicted, will try to return to its path at full speed and engage the FSD, but that is not really enough: there is no tactical use of countermeasures. I tried creating a procedure to win the interdiction minigame, but that is not guaranteed to work at all.
- Get out of stations without dying or getting fined (which would in time make the station hostile)
- Understand the galaxy map to actually travel somewhere (prerequisite #1 to make a trading bot)
- Understand the commodities market (text recognition that I haven’t covered yet, as prerequisite #2 to make a trading bot)
- Not hit dangerous stellar objects. I have written a procedure to avoid a sun (which was easy), but there are other, more subtle problems like hitting a black hole, a planet that is hidden in shadow, or planetary rings. These are hard to detect visually as they are not indicated in the HUD in a simple way and come in many forms and shapes.
Sooooo … where can we apply this in Elite Dangerous with these limits? Well, simple: this can still be a relatively simple autopilot that helps you make a few jumps to a destination, and if combined with an in-game docking computer it can request docking at the very end so you end up landed.
II.I. State machine, or “insert real AI here later”
This is the part where I will again emphasize that this is not an AI: this autopilot follows a set of rules and states that I gave it, so it is a BOT, not an AI. It cannot learn from its mistakes or invent new states. Like an idiot, it will continue following the states and conditions even if they lead it to a hazardous extraction site with cargo, or at full speed into a neutron star. Nevertheless, here are the states I created for a simple autopilot (with the limits mentioned above applied):
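For readers who have never written one, a rule-following state machine of this kind can be sketched in a few lines of plain Python. The state names, event names and transitions below are hypothetical, invented for illustration; they are not the actual states of the autopilot. In a real setup the events would come from the computer vision checks described in Part I.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical states, not the ones from the real autopilot.
    ALIGN_TO_TARGET = auto()
    CHARGE_FSD = auto()
    IN_JUMP = auto()
    AVOID_STAR = auto()
    SUPERCRUISE = auto()
    REQUEST_DOCKING = auto()
    DONE = auto()


# (current state, observed event) -> next state
TRANSITIONS = {
    (State.ALIGN_TO_TARGET, "aligned"): State.CHARGE_FSD,
    (State.CHARGE_FSD, "jump_started"): State.IN_JUMP,
    (State.IN_JUMP, "arrived_at_star"): State.AVOID_STAR,
    (State.AVOID_STAR, "star_cleared"): State.ALIGN_TO_TARGET,
    (State.AVOID_STAR, "star_cleared_last_system"): State.SUPERCRUISE,
    (State.SUPERCRUISE, "station_in_range"): State.REQUEST_DOCKING,
    (State.REQUEST_DOCKING, "docking_granted"): State.DONE,
}


def step(state, event):
    """Return the next state; events with no rule leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.ALIGN_TO_TARGET):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state


two_jump_flight = [
    "aligned", "jump_started", "arrived_at_star", "star_cleared",
    "aligned", "jump_started", "arrived_at_star", "star_cleared_last_system",
    "station_in_range", "docking_granted",
]
final_state = run(two_jump_flight)
print(final_state)
```

The table makes the "idiot" behavior visible: anything that is not listed as a transition is simply ignored, so a situation nobody planned for never changes what the bot does.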
II.II. An example run of my simple Elite autopilot
After a long time I decided that we will do this part as a video. Below is a live demonstration of a full flight made of two FSD jumps, supercruise and a station approach.
To call this a proper autopilot, there is one last aspect missing… leaving a station. Despite some nice examples I have given above in the sections on specific algorithms, this part of my autopilot remains very buggy, and therefore deadly. Classification by shape and color was successful in maybe 60-70% of cases; in the remaining cases it decided either that another ship’s engine was the exit door (because engines often have the same color and, from some angles, also the same shape) or, in some station types, that some building was a better candidate. Also, for bigger ships (I am flying an Imperial Cutter), there was just no way to avoid collisions with other traffic inside the door. If someone manages to solve this part, I would be happy if they shared it with me in the comments. For now my conclusion is that this is definitely possible too, but it would require a much more robust classification and tracking system. I assume that if someone really put together thousands of positive/negative images of station doors, the neural net training would be able to crack this problem with a higher success rate…
Conclusion and thoughts on gaming
Well, my personal conclusion is that these were great skills that will later help me with my other home projects (I just really want that laser turret shooting wasps, but I need these algorithms so it avoids shooting a bee, i.e. “insect classification” AI training is needed). However, after the last 2 months the conclusion is that if there is a pattern in your favorite game that a relatively simple state machine can loop through, and it doesn’t require very fast reactions (e.g. if you have seen the video, this classification ran at maybe 8-11 FPS due to high CPU load), there is a chance to build a grinding bot for it that will not be detectable by any anti-cheat system. Worst case, you take another PC #2, point its camera at PC #1 running the game, and hook up a keyboard interface so that PC #2 can somehow push buttons on PC #1, and you have essentially a “fake human” playing the game. This is what these advanced CV algorithms can do. It is not easy and it has only a statistical success rate, but it might be the next big AAA game economy killer if someone spends the large number of hours coding and training something like that…
…then again, I do not like grinding in any game (…yeah, coming from a guy who owns an Imperial Cutter in Elite, right…), as I believe it is just laziness on the part of developers to avoid creating a proper story/tournament/sandbox system. And if CV + AI technologies kill these games because it becomes too easy to get an AI bot to play the grinding parts instead of a player, we might see a future where grinding games are not developed anymore… who knows.
Feel free to let me know in the comments what you think about AI vs grinding in games… and please avoid going into the traditional cheats/bots topic (e.g. CS:GO wall-hacks, or game money trainers), as that is really a completely different topic, and cheating via the internal memory of a running game client is kind of lame and doesn’t deserve to be validated by even discussing it. Thanks!