Historic archive of defunct list bismark-devel@lists.bufferbloat.net
 help / color / mirror / Atom feed
From: Dave Taht <d@taht.net>
To: bismark-devel@lists.bufferbloat.net, hndr@lists.pixilab.org,
	 abhishekjain95@gmail.com
Subject: [Bismark-devel] Dates are really tricky!!
Date: Mon, 18 Apr 2011 06:46:22 -0600	[thread overview]
Message-ID: <4DAC329E.3040603@taht.net> (raw)
In-Reply-To: <4DAC2C03.2060502@taht.net>

On 04/18/2011 06:18 AM, Dave Taht wrote:
> On 04/17/2011 11:57 PM, Nick Feamster wrote:
>
>> I think we can whip up some quick visualizations of what's in the DB 
>> with a nice Web page with some concerted python hacking.  For 
>> starters, it would be nice to have a version of the data whereby 
>> queries can actually complete in a reasonable time.  :-)  What's the 
>> best way to do this?  Split the table by months?
>>
> I did look over one of the slower (30+ seconds!) queries and suggested 
> that a between statement be used instead of an and for the date range, 
> but we never got around to testing that and mysql used to not be able 
> to do anything sane with a between range (postgres can).
>
> e.g:
>
> mysql>  select dstport, sum(uppkts) as up_p, sum(dwpkts) as down_p, 
> sum(upbytes) as up_bytes, sum(dwbytes) as down_bytes from 
> FLOWS_newformat where tsstart>unix_timestamp('2011-2-6') and 
> tsstart<unix_timestamp('2011-2-20') and deviceid='NB105' group by 
> dstport order by down_bytes desc limit 20;
>
> MIGHT do better with:
>
> select dstport, sum(uppkts) as up_p, sum(dwpkts) as down_p, 
> sum(upbytes) as up_bytes, sum(dwbytes) as down_bytes from 
> FLOWS_newformat where tsstart between unix_timestamp('2011-2-6') and 
> unix_timestamp('2011-2-20') and deviceid='NB105' group by dstport 
> order by down_bytes desc limit 20;
>
>
> I'd not be huge on divvying up tables by month, but creating summary 
> tables by day and by hour would be a good idea. Usage patterns tend to 
> be self-similar across day of week and by hour of day.
>
> I note that it is important to be able to capture hour of day AT the 
> location (EDT/MDT/PDT), as that is more self-similar that hour of day 
> (UTC)
>
To be clearer, what I suggest above is not solved by what I wrote below, 
and the below contained an error.

Coping with dates can be really tricky and should not be undertaken 
before my first cup of coffee.

http://www.postgresql.org/docs/8.2/static/datatype-datetime.html

> So the above becomes a nightly cron job in a specialized summary 
> table, something like (untested)
>
> drop table device_port_daily_summary;
> create table device_port_daily_summary as
>
select deviceid, dstport, day(tsstart) as test_day, EXTRACT(YEAR FROM 
TIMESTAMP tsstart) as test_year , sum(uppkts) as up_p, sum(dwpkts) as 
down_p, sum(upbytes) as up_bytes, sum(dwbytes) as down_bytes from 
FLOWS_newformat group by deviceid, dstport, test_day, test_year;

The way described below has a day-range of 0-365, which wasn't what you 
wanted. Days since the epoch or the start of database time would 
probably be better than explicitly storing the year... but I don't 
remember the good way of doing that.

Dates are tricky.

> select deviceid, dstport, day(tsstart) as test_day, sum(uppkts) as 
> up_p, sum(dwpkts) as down_p, sum(upbytes) as up_bytes, sum(dwbytes) as 
> down_bytes from FLOWS_newformat group by deviceid, dstport, test_day

would be wrong.


  reply	other threads:[~2011-04-18 12:46 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-18  5:57 [Bismark-devel] FCC app progress: django Nick Feamster
2011-04-18  7:57 ` [Bismark-devel] [Hndr] " Nick Feamster
2011-04-18 12:18 ` Dave Taht
2011-04-18 12:46   ` Dave Taht [this message]
2011-04-18 13:22     ` [Bismark-devel] [Hndr] Dates are really tricky!! Marshini Chetty
2011-04-18 14:11     ` Nick Feamster
2011-04-18 14:37       ` Srikanth Sundaresan
2011-04-18 14:39         ` Nick Feamster
2011-04-18 14:42           ` Dave Taht
2011-04-18 14:58           ` Beki Grinter
2011-04-18 15:03             ` Nick Feamster
2011-04-18 15:35               ` Marshini Chetty
2011-04-18 16:02                 ` Keith Edwards

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DAC329E.3040603@taht.net \
    --to=d@taht.net \
    --cc=abhishekjain95@gmail.com \
    --cc=bismark-devel@lists.bufferbloat.net \
    --cc=hndr@lists.pixilab.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox