I've always had a strange habit of writing down the number of kilometres driven on each petrol receipt. I knew that one day I would use that data, somehow. But this meant I had to be consistent in how I filled up - taking a few more minutes to fill right to the brim - so that every new reading started from the same volume. It also meant a pile of small paper chits collecting in the centre console of the car, and noting down in pen the kilometres covered before starting the car on the next full tank.
I've been driving my Mazda CX-3 for just over 4 years now, and I've gathered quite a stack of petrol receipts. Much to my horror, the thermal paper of the pre-2021 receipts has started to fade. So over the weekend, I took some time to do some data entry and detective work. With a bit of peering under a bright light, it's surprising how much information can be recovered.
Over the 4 years I have filled my car just under 50 times. I may have lost some receipts, but I think I've collected at least 95% of the dataset.
Plotting a scatter graph of litres filled vs distance travelled should show the efficiency of my car and of each brand used:
I mainly use Shell or Petronas, and the data shows no apparent difference in fuel efficiency between them, whatever the advertisements would like you to believe. What I can tell empirically from this data is that every 10 litres of fuel takes me about 114 km.
Or programmatically:
m,c = np.polyfit(x,y,1)
where m == 11.412
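For anyone curious what that one-liner does: a degree-1 polyfit is an ordinary least-squares line fit. Here is a minimal sketch of the same calculation in plain Python. The helper name fit_line and the fill-up numbers are made up for illustration; they are not my receipts.

```python
def fit_line(x, y):
    """Least-squares slope m and intercept c for y = m*x + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

# Illustrative data only: litres filled and km driven on that tank.
litres = [30.0, 35.0, 40.0, 42.0]
km = [342.0, 399.0, 456.0, 478.8]  # constructed as 11.4 km per litre

m, c = fit_line(litres, km)
print(round(m, 3), round(c, 3))
```

With x as litres and y as kilometres, m reads directly as kilometres per litre, and c should sit close to zero since zero litres gets you nowhere.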
So Petronas has almost two thirds of my petrol spending over the past 4 years thanks to contactless payments. However that darn Setel requires me to punch in my phone number which is quite a hassle. Can't it just use my card number?
In terms of usage over the years, my car has apparently become less efficient. Or I'm becoming a more aggressive driver.
There is a slight general downward trend in distance per volume. I should note, though, that efficiency was high in mid 2020 because I was driving down to Cyberjaya very frequently for a project (during the MCO), and having worked in Putrajaya for almost a decade, I've learnt how to cruise efficiently down the MEX.
The column chart above shows the change in my petrol vendor of choice. I had a Shell fleet card for 20 years and switched over in 2020. What drove me to Petronas in mid 2020 was the widespread availability of Amex PayWave, combined with my having very little brand loyalty to Shell. I don't have to enter a fleet card code, and I get to collect more points via Amex.
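To put a number on a trend like this, one option is to compute km per litre for each fill-up and fit a line against time. The sketch below does that in plain Python. The function name km_per_litre_trend and the three fill-ups are hypothetical, chosen only to show the mechanics.

```python
from datetime import date

# Hypothetical fill-ups: (date, litres, km driven on that tank).
fills = [
    (date(2019, 1, 5), 40.0, 480.0),
    (date(2020, 1, 5), 40.0, 470.0),
    (date(2021, 1, 4), 40.0, 460.0),
]

def km_per_litre_trend(fills):
    """Least-squares slope of km/litre against time, in km/litre per year."""
    start = fills[0][0]
    years = [(d - start).days / 365.25 for d, _, _ in fills]
    eff = [km / litres for _, litres, km in fills]
    n = len(fills)
    mean_t = sum(years) / n
    mean_e = sum(eff) / n
    num = sum((t - mean_t) * (e - mean_e) for t, e in zip(years, eff))
    den = sum((t - mean_t) ** 2 for t in years)
    return num / den

slope = km_per_litre_trend(fills)
print(f"{slope:+.3f} km/litre per year")
```

A negative slope means fewer kilometres per litre as time goes on. It says nothing about whether the car or the driver is to blame.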
Regarding the plots, I hand-coded each one with Python's matplotlib. Over the past few years I would like to think I've mastered it, or at least can adjust any part of its formatting. For this little study, I initially tried using Google Sheets' charting to generate the visualisations. However, I got frustrated with the inability to do simple formatting tasks; for example, it's not possible to adjust the sizes of the circles in the bubble chart.
Additionally, it's almost impossible to overlay a linear regression line on it, let alone produce the beautiful Seaborn regression plot (as above).
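For comparison, here is roughly what the matplotlib equivalent can look like: a scatter plot with adjustable marker sizes and a fitted line drawn over it. This is a sketch and not the code behind the charts in this post. The data points and the output filename fuel_fit.png are made up.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file; no display needed
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only, not the real receipts.
litres = np.array([30.0, 35.0, 38.0, 40.0, 42.0])
km = np.array([345.0, 395.0, 440.0, 452.0, 481.0])

# Degree-1 fit: m is the slope (km per litre), c the intercept.
m, c = np.polyfit(litres, km, 1)

fig, ax = plt.subplots(figsize=(6, 4))
# The s argument sets the marker area, so circle size is one parameter away.
ax.scatter(litres, km, s=60, alpha=0.7, label="fill-ups")

# Draw the fitted line across the observed range of litres.
xs = np.linspace(litres.min(), litres.max(), 50)
ax.plot(xs, m * xs + c, color="tab:red", label=f"fit: {m:.2f} km/litre")

ax.set_xlabel("Litres filled")
ax.set_ylabel("Distance travelled (km)")
ax.legend()
fig.savefig("fuel_fit.png", dpi=150)
```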
Even a simple brand preference chart showing the different colours over the months proved challenging in Google Sheets. The best I got was this (which entailed using a pivot table to split the brands by rows):
So while a web-based charting UI is good enough for 98% of users out there, to achieve effective visualisations it is best to learn some Python and hammer out your own custom charts. The code doesn't need to be pretty as it rarely gets run again, but the output sure is effective.
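In code, the pivot step is only a few lines. The sketch below counts fill-ups per month and brand and lays them out one row per brand, which is the shape a stacked column chart wants. The rows are hypothetical and the month strings are just an assumed format.

```python
from collections import Counter

# Hypothetical rows: (month, brand), one per fill-up.
fills = [
    ("2020-05", "Shell"),
    ("2020-05", "Shell"),
    ("2020-06", "Shell"),
    ("2020-06", "Petronas"),
    ("2020-07", "Petronas"),
]

counts = Counter(fills)  # (month, brand) -> number of fill-ups
months = sorted({month for month, _ in fills})
brands = sorted({brand for _, brand in fills})

# One row per brand, one column per month; missing pairs count as 0.
table = {b: [counts[(mo, b)] for mo in months] for b in brands}
for brand, row in table.items():
    print(brand, row)
```

Each row can then be passed to a bar call with its own colour, stacked on top of the previous brand.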
(just double checking that m heads towards zero)
yk.