Throw out obviously bad data? - Fuelly Forums

Old 09-03-2014, 05:19 PM   #1
Registered Member
 
Join Date: May 2009
Posts: 4
Country: United States
Throw out obviously bad data?

Would it be possible for Fuelly to throw out extreme outliers when computing averages? For example, there is a guy with a 2014 Mazda 6 who put in 11,051 gallons on a refuel. Obviously that was a mistake. Does it make sense to say that 2014 Mazda 6s get an average of 9.0 MPG? Heck no.

I'm not saying go in and change that guy's data, just don't use it to calculate averages.
evoblade is offline   Reply With Quote
Old 09-03-2014, 06:46 PM   #2
Site Team / Moderator
 
Join Date: Sep 2008
Location: Northern Virginia
Posts: 4,657
Country: United States
If you click the report page link on that vehicle page (or post a link to it in this thread), I can take a look, and if it truly is out of the realm of possibility, I can see what I can do to fix or delete it.
Jay2TheRescue is offline   Reply With Quote
Old 09-04-2014, 04:45 AM   #3
Registered Member
 
Join Date: Nov 2007
Posts: 1,445
Country: United States
Location: north east PA
I think what he is actually asking for is that, instead of having to report every suspect vehicle and then have you guys fix it, the software just ignore the data outliers when researching vehicle averages.
trollbait is offline   Reply With Quote
Old 09-04-2014, 09:19 AM   #4
Registered Member
 
Join Date: Feb 2013
Posts: 1,900
Country: United States
Location: San Antonio, TX
We are looking at ways to stop/catch bad data at the point of it being entered. But if we put up a blocker, it can become frustrating at the time of entry. If we give the option to ignore the warning/error message, it can be overlooked and forgotten.

The Team is still looking for the best possible solution, because we want to see good data just as much as you!
RobertV is offline   Reply With Quote
Old 09-04-2014, 10:01 AM   #5
Registered Member
 
Join Date: Nov 2007
Posts: 1,445
Country: United States
Location: north east PA
Perhaps allow the user to select using the middle 95% of vehicles for calculating the average fuel economy of a group of cars.
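A minimal sketch of the "middle 95%" idea in Python, assuming each vehicle has already been reduced to a single average MPG figure. The function name `trimmed_mean` and the `keep_fraction` parameter are made up for illustration; this is not how Fuelly computes anything.

```python
def trimmed_mean(values, keep_fraction=0.95):
    """Average the middle `keep_fraction` of values, dropping an equal share from each tail."""
    if not values:
        raise ValueError("no values to average")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    ordered = sorted(values)
    # Number of values to drop from EACH end, rounded down.
    # The tiny epsilon guards against float error (e.g. 0.9999999 instead of 1).
    drop = int(len(ordered) * (1 - keep_fraction) / 2 + 1e-9)
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)


# Hypothetical group: 38 ordinary cars plus one absurdly low and one absurdly high entry.
group = [30.0] * 38 + [0.003, 500.0]
print(trimmed_mean(group))
```

Note that at 95% this only starts dropping anything once a group has at least 40 vehicles, since 2.5% of a smaller group rounds down to zero per side.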
trollbait is offline   Reply With Quote
Old 09-04-2014, 10:46 AM   #6
Site Team
 
Join Date: Sep 2012
Posts: 317
Country: United States
Location: Dallas, Tx
We have been actively working on ways to identify outliers, mark them as potential errors and remove them from the crowdsourced results. I don't have an ETA for this, as our developers are currently working on some projects that need to get done for our smartphone apps, but hopefully in the next couple of months we'll be able to implement a much more sophisticated tool to identify and ignore bad data.

Not only do we plan to use the data to identify outliers, but also to understand what "normal" results are on a Year, Make, Model basis and then analyze data as it's being input into the system to identify issues before they are submitted. For example, the owner of this 2014 Mazda 6 probably entered their mileage in the gallon field, and we should be able to identify that and fix it before the data is saved to our database.
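As a rough illustration of an entry-time check like the one described, here is a Python sketch that compares the implied MPG of a new fuel-up against a "normal" range for the model. The function name `check_fuelup`, the 15 to 60 MPG range, and the 350-mile distance are all hypothetical; only the 11,051 gallon figure comes from this thread.

```python
def check_fuelup(miles, gallons, typical_mpg_range):
    """Return a list of warning strings for a fuel-up before it is saved."""
    low, high = typical_mpg_range
    if miles <= 0 or gallons <= 0:
        return ["miles and gallons must both be positive"]
    mpg = miles / gallons
    warnings = []
    if mpg < low:
        warnings.append(f"{mpg:.2f} MPG is below the usual {low}-{high} range; check the gallons field")
    elif mpg > high:
        warnings.append(f"{mpg:.2f} MPG is above the usual {low}-{high} range; check the miles field")
    return warnings


# Hypothetical range for the model, and a hypothetical 350-mile tank.
print(check_fuelup(350, 11051, (15, 60)))   # the mistyped entry
print(check_fuelup(350, 11.051, (15, 60)))  # the same entry with a decimal point
```

Returning warnings rather than raising an error leaves it up to the caller whether to block the entry or just ask the user to confirm.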

Sometimes people forget to log a fuel-up and then on their next fuel-up the distance traveled is usually about twice that of a single fuel-up. Right now we require them to tell us that they skipped a fuel-up but we should be able to use the data we have to identify issues like this and help our users fix the issue.
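A sketch of how a skipped fuel-up might be spotted from the distance alone, assuming the vehicle's earlier per-tank distances are available. The function name, the 1.7x threshold, and the minimum of three prior fuel-ups are assumptions picked for illustration.

```python
from statistics import median


def looks_like_missed_fuelup(previous_distances, new_distance, ratio_threshold=1.7):
    """True if the new distance is far above this vehicle's typical distance per tank."""
    if len(previous_distances) < 3:
        return False  # not enough history to say what "typical" is
    typical = median(previous_distances)
    return typical > 0 and new_distance >= ratio_threshold * typical


# Hypothetical history of miles per tank, then a suspiciously long one.
history = [300, 320, 310, 290]
print(looks_like_missed_fuelup(history, 610))
print(looks_like_missed_fuelup(history, 330))
```

The median is used for the baseline so that one earlier double-length entry does not shift what counts as typical.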

We have been analyzing statistical methods to identify outliers and to determine the max capacity of a fuel tank, the max distance per tank and other interesting tidbits. We are also gathering data on fuel prices and locations so we can determine if someone accidentally put the fuel price in the gallon field and the gallons in the fuel field, even on fairly small fuel-ups.
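One way a swapped-field check could work, sketched in Python. The function name `maybe_swapped`, the `expected_price` argument (a typical local price per gallon) and the 15% tolerance are assumptions for illustration only.

```python
def maybe_swapped(gallons, price_per_gallon, expected_price, tolerance=0.15):
    """True if `gallons` looks like the local price while `price_per_gallon` does not."""
    def near_expected(value):
        return abs(value - expected_price) <= tolerance * expected_price

    # If both numbers are close to the local price the entry is ambiguous,
    # so only flag the case where the gallons value alone matches the price.
    return near_expected(gallons) and not near_expected(price_per_gallon)


# Hypothetical small fuel-up where the two fields were entered the wrong way round.
print(maybe_swapped(gallons=3.459, price_per_gallon=4.2, expected_price=3.45))
```

This is why knowing the local price matters: on a fill of only a few gallons, both numbers are small, and a tank-capacity check alone would not catch the swap.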

Once we have developed these methods we will retroactively scan our entire database for issues and mark them accordingly. We will then highlight the fuel-ups on the logs so our drivers can see that they have a potential error.

In the meantime, you can paste a link to the vehicle with the error here in the forums and we can set that vehicle to be ignored.
andyrobo is offline   Reply With Quote
Old 09-04-2014, 12:54 PM   #7
Registered Member
 
Join Date: May 2009
Posts: 4
Country: United States
Here is the specific one I am referring to:

http://www.fuelly.com/car/mazda/6/2014/threewest/243719

I really think there should be some kind of automatic filtering if someone puts in 11 thousand gallons of fuel to throw that one out, at least in regards to the group average.

Honestly it might not be a bad idea to just throw out a certain top percent and lowest percent as far as the averages are concerned.
evoblade is offline   Reply With Quote
Old 09-04-2014, 01:39 PM   #8
Registered Member
 
Join Date: Jul 2014
Posts: 126
Country: Ireland
Location: Galway
You can't throw out the top ones, that would be me

I don't lie, honest...

But seriously, you could remove the extremes to get the average.

One of the problems, though, is that there are so many categories of vehicles that most of the averages are based on only a handful of cars. That is always going to allow the extremes to influence the average more than they should.
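For groups that small, one alternative to trimming (my suggestion, not something proposed above) is to report the median instead of the mean, since a single extreme value barely moves it. A quick Python sketch with made-up numbers:

```python
from statistics import mean, median


def group_mpg(vehicle_mpgs):
    """Summarize a group of vehicles by the median MPG rather than the mean."""
    if not vehicle_mpgs:
        raise ValueError("no vehicles in group")
    return median(vehicle_mpgs)


# Hypothetical group of five cars; 0.03 stands in for a mistyped entry.
sample = [31.0, 29.5, 30.2, 0.03, 32.1]
print(mean(sample))       # dragged down by the one bad entry
print(group_mpg(sample))  # stays with the typical cars
```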

Oliver.
OliverGT is offline   Reply With Quote
Old 09-04-2014, 02:26 PM   #9
Site Team / Moderator
 
Join Date: Sep 2008
Location: Northern Virginia
Posts: 4,657
Country: United States
After looking at that fuel log, and comparing the erroneous entry with others for the same vehicle, it was obvious that the user forgot to enter the decimal point when entering their gallons. I've corrected the error. If you see any more just use the report page function. I try to keep on top of those so they don't get backlogged like they have in the past.
Jay2TheRescue is offline   Reply With Quote
Old 09-05-2014, 04:29 AM   #10
Registered Member
 
Join Date: Feb 2013
Posts: 1,900
Country: United States
Location: San Antonio, TX
Quote:
Originally Posted by Jay2TheRescue View Post
After looking at that fuel log, and comparing the erroneous entry with others for the same vehicle, it was obvious that the user forgot to enter the decimal point when entering their gallons. I've corrected the error. If you see any more just use the report page function. I try to keep on top of those so they don't get backlogged like they have in the past.
Thank you!
RobertV is offline   Reply With Quote