For tests I've used 2.7MB .nmea file. It contains about 5 thousand GPRMC sentences from which about 3 thousand contain useful information.
The first version of readNMEA function takes about 0.87 seconds (on my Asus EeePC 1001P) to make an array of GPRMC class instances. How can we improve that?
Firstly, access to the elements of an array (by index) is faster than access to the elements of a hash (by key). Thus using old-style regular expressions decreases the time from 0.87 to 0.72 seconds:
@@pattern = %r{\$GPRMC,(\d{2})(\d{2})([\d.]+),(\w), (\d{2})([\d.]+),N,(\d{3})([\d.]+),E, ([\d.]+),([\d.]+),(\d{2})(\d{2})(\d{2}) }x def initialize(str) m = @@pattern.match(str) if not m then @valid = false else c = m.captures @valid = (c[3] == 'A') yy = c[12].to_i yyyy = yy < 83 ? 2000+yy : 1900+yy msec = (c[2].to_f - c[2].to_i) * 1000 @timestamp = Time.gm(yyyy, c[11].to_i, c[10].to_i, c[0].to_i, c[1].to_i, c[2].to_i, msec) @latitude = c[4].to_f + c[5].to_f / 60.0 @longitude = c[6].to_f + c[7].to_f / 60.0 @speed = @@KNOT * c[8].to_f @angle = c[9].to_f end end
The next noticeable issue is that readNMEA function creates GPRMC objects for the data which is not being used. So we should find a better way to check validity of a string. And such a way usually does exist, but that depends on the GPS receiver model. For example, my receiver yields every second GPGSA sentence as well as GPRMC one. That allows to rewrite readNMEA as follows:
def readNMEA(filename) data = [] File.open(filename) do |f| while lineRMC = f.gets do if lineRMC =~ /^\$GPRMC/ then while lineGSA = f.gets do if lineGSA =~ /^\$GPGSA/ then if not lineGSA =~ /^\$GPGSA,A,1/ then tmp = GPRMC.new lineRMC if tmp.valid then data << tmp end end break end end end end end return data end
After that the time reduced down to 0.57 seconds (3077 out of 5613 sentences were valid, so compare: 3077/5613 = 0.55 whereas 0.57/0.72 = 0.79. So this additional check will in theory slow down the process if more than 0.55/0.79 ≈ 70% of the data is valid.)
Nevertheless, I doubt that 0.57s is satisfactory for, say, using this script in web-applications via CGI. Since we hardly can improve regular expression performance, let's move in another direction: caching. We can save the data in a format better suited to read—so as to avoid regex matching. There's several possibilities. We can use YAML or SQLite, but Ruby provides a simpler approach. For example, there's an easy to use library called PStore (see example of use). But it does something useful in general but unnecessary in our case. First, it allows to store several objects in one file whereas we need to store only one. Second, it computes md5 hashes of files, and that makes PStore slower. So we'll use the module which this library relies upon—module Marshal from Ruby core. It's written in C so I doubt we can find anything faster.
The approach is as follows:
def readNMEA(filename) data = [] storagename = filename + '.dump' if File.exists? storagename then File.open(storagename) do |f| data = Marshal.load f.read end else File.open(filename) do |f| ... end File.open(storagename, "w+") do |f| f.write (Marshal.dump data) end end return data end
Now the time of reading & writing the dump for the first time takes 0.75s but afterwards reading directly from the dump takes just 0.18s. By the way, we may also delete the original .nmea file if we are short of disk-space: the dump file is about 5 times smaller in size. Although, the original file contains some other NMEA sentences which may be useful.
By the way, we can store not Time objects but just floats (by adding .to_f after Time constructor call). That makes program a little bit faster.
No comments:
Post a Comment