Throw out obviously bad data?
Would it be possible for Fuelly to throw out extreme outliers when computing averages? For example there is a guy with a 2014 Mazda 6 that put in 11,051 gallons on a refuel. Obviously that was a mistake. Does it make sense to say that 2014 Mazda 6s get an average of 9.0 MPG? Heck no.
I'm not saying go in and change that guy's data, just don't use it to calculate averages. |
If you click the report page link on that vehicle page (or post a link to it in this thread), I can take a look, and if it truly is out of the realm of possibility, I can see what I can do to fix or delete it.
|
I think what is actually being asked for is that, instead of users having to report every suspect vehicle and then have you guys fix it, the software should just ignore the data outliers when computing vehicle averages.
|
We are looking at ways to stop/catch bad data at the point of it being entered. But if we put up a blocker, it can become frustrating at the time of entry. If we give the option to ignore the warning/error message, it can be overlooked and forgotten.
The Team is still looking for the best possible solution, because we want to see good-data just as much as you! |
Perhaps allow the user to select using the middle 95% of vehicles for calculating the average fuel economy of a group of cars.
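A rough sketch of what a "middle 95%" average could look like, assuming each vehicle contributes one MPG figure. The function name `trimmed_mean` and the `trim_each_side` parameter are made up for illustration; nothing here reflects how Fuelly actually computes its averages.

```python
def trimmed_mean(values, trim_each_side=0.025):
    """Average the values after dropping a fraction from each end.

    trim_each_side=0.025 keeps roughly the middle 95% (an assumed default).
    """
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    # Number of entries to drop from each end, rounded down.
    drop = int(len(ordered) * trim_each_side)
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)


# 38 plausible vehicles plus one absurdly low and one absurdly high entry.
mpg = [30.0] * 38 + [9.0, 120.0]
print(trimmed_mean(mpg))
```

With small groups (fewer than 40 vehicles at this setting) nothing is dropped at all, so a trimmed mean alone would not help the models that only have a handful of cars.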
|
We have been actively working on ways to identify outliers, mark them as potential errors and remove them from the crowdsourced results. I don't have an ETA for this as our developers are currently working on some projects that need to get done for our smartphone apps but hopefully in the next couple months we'll be able to implement a much more sophisticated tool to identify and ignore bad data.
Not only do we plan to use the data to identify outliers, but also to understand what "normal" results are on a Year, Make, Model basis, and then to analyze data as it's being input into the system to identify issues before they are submitted. For example, this 2014 Mazda 6 owner probably entered their mileage in the gallon field, and we should be able to identify that and fix it before the data is saved to our database. Sometimes people forget to log a fuel-up, and then on their next fuel-up the distance traveled is usually about twice that of a single fuel-up. Right now we require them to tell us that they skipped a fuel-up, but we should be able to use the data we have to identify issues like this and help our users fix them.
We have been analyzing statistical methods to identify outliers and to determine the max capacity of a fuel tank, the max distance per tank and other interesting tidbits. We are also gathering data on fuel prices and locations so we can determine if someone accidentally put the fuel price in the gallon field and the gallons in the price field, even on fairly small fuel-ups. Once we have developed these methods we will retroactively scan our entire database for issues and mark them accordingly. We will then highlight the fuel-ups on the logs so our drivers can see that they have a potential error. In the meantime, you can paste a link to the vehicle with the error here in the forums and we can set that vehicle to be ignored. |
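As a minimal sketch of the entry-time checks described above: the function `check_fuelup`, its parameters, and the 1.1 and 1.8 thresholds are all assumptions chosen for illustration, not Fuelly's actual rules. The tank capacity and typical distance per tank would have to come from the per-model statistics the post mentions.

```python
def check_fuelup(gallons, miles, tank_capacity_gal, typical_miles_per_tank):
    """Return a list of warnings for one fuel-up entry (hypothetical rules)."""
    warnings = []
    # More fuel than the tank can plausibly hold: likely a dropped decimal
    # point or a value typed into the wrong field. The 10% margin is assumed.
    if gallons > tank_capacity_gal * 1.1:
        warnings.append("gallons exceed tank capacity")
    # Close to double the usual distance suggests a skipped fuel-up.
    # The 1.8 multiplier is an assumed cutoff.
    if miles > typical_miles_per_tank * 1.8:
        warnings.append("possible missed fuel-up")
    return warnings


# Illustrative numbers only; the capacity and distance are not real specs.
print(check_fuelup(11051, 400, 16.0, 450))
print(check_fuelup(12.0, 850, 16.0, 450))
```

Returning warnings rather than rejecting the entry leaves the decision with the user, which is the trade-off between blocking and nagging raised earlier in the thread.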
Here is the specific one I am referring to:
https://www.fuelly.com/car/mazda/6/2014/threewest/243719
I really think there should be some kind of automatic filtering to throw an entry out if someone puts in 11 thousand gallons of fuel, at least as far as the group average is concerned. Honestly, it might not be a bad idea to just throw out a certain top percent and lowest percent when computing the averages. |
You can't throw out the top ones, that would be me :(
I don't lie, honest... But seriously, you could remove the extremes to get the average. One of the problems, though, is that there are so many categories of vehicles that most of the averages are based on only a handful of cars. This is always going to allow the extremes to influence the average more than they should. Oliver. |
After looking at that fuel log, and comparing the erroneous entry with others for the same vehicle, it was obvious that the user forgot to enter the decimal point when entering their gallons. I've corrected the error. If you see any more just use the report page function. I try to keep on top of those so they don't get backlogged like they have in the past.
|
Sorry to bump an old thread, but there is a converse to this. I have the only manual transmission listed in my model year. The other cars are getting about 22 mpg, I get about 28-29. Every one of my fill-ups has been thrown out as an outlier.
1989 Saab 900. |
It hasn't been thrown out, that's a poor choice of words. On the chart that shows the distribution of mileage by fill-up, mine are all counted as outliers. I see the number of outliers incremented each time I add data.
The issue is that as the only standard in the group, I'm about 40% high and the algorithm flags the data. |
1989? Gee... The computer knows that's when dinosaurs ruled the world.
|
You only have 5 fillups so far. I think after you've been using Fuelly for a while, they will probably start showing up on the chart, and not be cast as an outlier.
|
I assume that the application is using standard deviation to determine the outliers, but the problem for the '89 900 is that about 3/4 of the fillups come from a single vehicle and it's running 19.3 MPG (US). This vehicle should show a double peak (one for the standards at about 28 MPG, another for the autos at about 21), not a standard bell curve. |
11 fillups now - all have been pitched from the dataset as outliers.
It looks to me as though any given fill is checked against the rest of the data at the moment of entry and, if flagged as an outlier, never checked again. This saves processing time but ignores any emerging patterns in the data. The '89 Saab 900 has almost 10% of the data marked as outliers. Is there a process to periodically reanalyze the data? If not, there should be. |
There are links on the admin side to make Fuelly redo the math on your vehicle, but when I click on them, I get a no permission message. Maybe someone with higher admin permissions than me can try?
|
The other option is to manually set them all to 25 mpg so they register, then see whether I can increase them bit by bit back to 26, 27, 28, 29.... :)
Fortunately I keep my own record of data. |
Your vehicle is being considered an outlier simply because of the lack of other users/vehicles that match yours mechanically. Even if just comparing the 2 vehicles that are "Hatchbacks", your 11 fuelups at 29MPG vs the other user's 240 fuelups at 20MPG make you an outlier.
The system isn't ruling your data as wrong or invalid. It's simply much higher than the averages of the other users with the same year/make/model. If, for example, you had 100 more fuel ups, you'd stop being an outlier, statistically. Maybe if our system allowed a filter for transmission type, that'd help in this situation. Something we'll need to look at in a future update.
If you just add (or rather, when you have) more fuel ups, the graphs will create that double peak... like seen here: https://www.fuelly.com/car/volkswagen/jetta/2015 It's also worth noting that the data generates fresh every hour, therefore no fuelup is an outlier until that hourly query says it is.
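To illustrate the transmission filter idea, here is a small sketch that averages each transmission type separately, so a lone manual is compared with other manuals instead of with the automatics. The `(transmission, mpg)` tuple layout and the function name are hypothetical; this is not an existing Fuelly feature.

```python
from collections import defaultdict
from statistics import mean


def mean_mpg_by_transmission(fuelups):
    """Group (transmission, mpg) pairs and average each group on its own."""
    groups = defaultdict(list)
    for transmission, mpg in fuelups:
        groups[transmission].append(mpg)
    return {transmission: mean(values) for transmission, values in groups.items()}


# Made-up fuel-ups resembling the two peaks discussed in this thread.
fuelups = [("auto", 20.0), ("auto", 22.0), ("manual", 28.0), ("manual", 30.0)]
print(mean_mpg_by_transmission(fuelups))
```

Any outlier test would then run inside each group, which is what keeps a consistent 28-29 MPG manual from being measured against a 20 MPG automatic.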
|
So what determines an outlier? >2 standard deviations?
|
https://en.wikipedia.org/wiki/Interquartile_range
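For anyone curious, the usual interquartile-range rule from that article can be sketched as below. The 1.5 multiplier is the conventional choice; whether Fuelly uses that value, or this exact quartile method, is an assumption on my part.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up MPG figures: a cluster around 21 plus one 29 MPG fill-up.
mpg = [19, 20, 20, 21, 21, 21, 22, 22, 23, 29]
print(iqr_outliers(mpg))
```

Because the fences come from the middle half of the data, one vehicle supplying most of the fill-ups effectively sets them, which matches what the Saab owner is seeing.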
|
There are lots of obvious errors that then get washed into a poor overall average.
Look at this one, for example: 2011 Town & Country Ltd. (Chrysler Town & Country) | Fuelly. It should be a great set of data, with supposedly 70,000 miles logged. The average seems great, but start to look at the individual entries and you see a more normal 20 MPG. Go way back and suddenly 250 mpg tanks start showing up. Look at the best tank: 320 mpg. Unless you are some kind of special prototype, or a scooter, any tank over 100 mpg should be thrown out. I would bet any tank that is over 3 times the average of the model is some kind of error that should be flagged and not included in any overall averages. |
Newbie here, hopefully not posting in the wrong thread. In my research for a new vehicle I stumbled upon this profile (Chevy Pickup (Chevrolet Volt) | Fuelly), which is responsible for over 1/3 of the fill-ups for the listed model. However, it is obviously a truck and not a Volt (given the Volt's fuel tank is less than 9 gallons and the entered fill-ups are for well over 20 gallons). It is another example of an "outlier" that is harshly affecting the average fuel economy for the group.
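A quick sketch of the "3 times the average" rule. Note one swap: it uses the median as the baseline instead of the mean, because a few 250+ mpg tanks would drag the mean up and hide themselves. The function name and the `factor` default are illustrative assumptions.

```python
from statistics import median


def flag_implausible(mpg_values, factor=3.0):
    """Return tanks whose MPG exceeds `factor` times the median MPG."""
    # Median rather than mean: the bad entries barely move it.
    baseline = median(mpg_values)
    return [mpg for mpg in mpg_values if mpg > factor * baseline]


# Made-up tanks: mostly around 20 mpg, with two impossible entries.
tanks = [20, 21, 19, 22, 250, 320, 20]
print(flag_implausible(tanks))
```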
|
Somewhere, shortly after I started using Fuelly, I must have missed a fillup, since all of a sudden my mileage went over 100mpg. Is there any way to delete whatever happened to get the mileage more in line with real life? Page is at Fuelly - Track and Compare your MPG
|
Using odometer.
|
I deleted ALL the fill-ups and will be starting over from 'scratch'.
|
There are two obviously incorrect vehicles that currently are responsible for more than 1/4 of the 2017 Honda CR-V miles in the system, yet show fuelups for several years. It seems the users involved have reclassified some older car as a 2017 CR-V. They are dropping the reported mileage in a significant way:
Van (Honda CR-V) | Fuelly CRV (Honda CR-V) | Fuelly |
Does anyone from Fuelly actually read this forum?
Take a look at the 2017 CR-V fuel-up statistics: 2017 Honda CR-V MPG - Actual MPG from 73 2017 Honda CR-V owners You will see 2 bell-shaped curves side by side; the one on the left from the 2 cars that aren't 2017 CR-Vs, and the one on the right from cars that are. If manual intervention isn't allowed, at least you could filter out fuel-ups from prior to the model year (or even 2 years prior to the model year) as an algorithmic way of removing bad data. |
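The date filter suggested above could be as simple as the sketch below. The function name and the `years_before` parameter are hypothetical, and the 2-year default is just the looser of the two cutoffs mentioned, not anything Fuelly has confirmed.

```python
from datetime import date


def plausible_for_model_year(fuelup_date, model_year, years_before=2):
    """True if the fuel-up is not dated too long before the model year."""
    # A car cannot be refuelled years before it was built; allow some
    # slack because model years go on sale early (assumed: 2 years).
    return fuelup_date.year >= model_year - years_before


print(plausible_for_model_year(date(2012, 5, 1), 2017))
print(plausible_for_model_year(date(2016, 11, 1), 2017))
```

A check like this would catch an older car that was reclassified as a 2017 CR-V without anyone having to review it by hand.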
I agree with Larryd. I was researching the 2017 CRV and noticed the obviously misplaced rogue vehicles. Can the moderators not clean up the CRV entries so the average MPG can be compared to other years?
|
A report button on each vehicle's profile page would be helpful.
|
I agree, I found this "BMW" just now and was looking for a way to report it just so the data could be more trustworthy: My Grand Cherokee (BMW 330Ci) | Fuelly |
Goes both ways. I got an honest high mileage in my Civic, but the algorithm tosses mine out as an outlier because most are getting significantly less than I am, working for the mileage. Is what it is. I know I'm honest, but there are folks who would fluff up their mileage or not pay attention to the numbers they're inputting.
|
Then there are the people who get horrible mileage simply because of their regular driving cycle. A short commute in which the car can never fully warm up will drag down the efficiency of even the best hybrids.
Their low fuel efficiency isn't due to the car, so that data is also tossed in order not to give the impression that the car model is that bad. When you drill down while researching a car model, you will eventually get an option to view the outliers. |