From: Dave Taht <d@taht.net>
To: bismark-devel@lists.bufferbloat.net, hndr@lists.pixilab.org,
abhishekjain95@gmail.com
Subject: [Bismark-devel] Dates are really tricky!!
Date: Mon, 18 Apr 2011 06:46:22 -0600 [thread overview]
Message-ID: <4DAC329E.3040603@taht.net> (raw)
In-Reply-To: <4DAC2C03.2060502@taht.net>
On 04/18/2011 06:18 AM, Dave Taht wrote:
> On 04/17/2011 11:57 PM, Nick Feamster wrote:
>
>> I think we can whip up some quick visualizations of what's in the DB
>> with a nice Web page with some concerted python hacking. For
>> starters, it would be nice to have a version of the data whereby
>> queries can actually complete in a reasonable time. :-) What's the
>> best way to do this? Split the table by months?
>>
> I did look over one of the slower (30+ seconds!) queries and suggested
> that a between statement be used instead of an and for the date range,
> but we never got around to testing that and mysql used to not be able
> to do anything sane with a between range (postgres can).
>
> e.g:
>
> mysql> select dstport, sum(uppkts) as up_p, sum(dwpkts) as down_p,
> sum(upbytes) as up_bytes, sum(dwbytes) as down_bytes from
> FLOWS_newformat where tsstart>unix_timestamp('2011-2-6') and
> tsstart<unix_timestamp('2011-2-20') and deviceid='NB105' group by
> dstport order by down_bytes desc limit 20;
>
> MIGHT do better with:
>
> select dstport, sum(uppkts) as up_p, sum(dwpkts) as down_p,
> sum(upbytes) as up_bytes, sum(dwbytes) as down_bytes from
> FLOWS_newformat where tsstart between unix_timestamp('2011-2-6') and
> unix_timestamp('2011-2-20') and deviceid='NB105' group by dstport
> order by down_bytes desc limit 20;
>
>
> I'd not be huge on divvying up tables by month, but creating summary
> tables by day and by hour would be a good idea. Usage patterns tend to
> be self-similar across day of week and by hour of day.
>
> I note that it is important to be able to capture hour of day AT the
> location (EDT/MDT/PDT), as that is more self-similar that hour of day
> (UTC)
>
To be clearer, what I suggest above is not solved by what I wrote below,
and the below contained an error.
Coping with dates can be really tricky and should not be undertaken
before my first cup of coffee.
http://www.postgresql.org/docs/8.2/static/datatype-datetime.html
> So the above becomes a nightly cron job in a specialized summary
> table, something like (untested)
>
> drop table device_port_daily_summary;
> create table device_port_daily_summary as
>
select deviceid, dstport, day(tsstart) as test_day, EXTRACT(YEAR FROM
TIMESTAMP tsstart) as test_year , sum(uppkts) as up_p, sum(dwpkts) as
down_p, sum(upbytes) as up_bytes, sum(dwbytes) as down_bytes from
FLOWS_newformat group by deviceid, dstport, test_day, test_year;
The way described below has a day-range of 0-365, which wasn't what you
wanted. Days since the epoch or the start of database time would
probably be better than explicitly storing the year... but I don't
remember the good way of doing that.
Dates are tricky.
> select deviceid, dstport, day(tsstart) as test_day, sum(uppkts) as
> up_p, sum(dwpkts) as down_p, sum(upbytes) as up_bytes, sum(dwbytes) as
> down_bytes from FLOWS_newformat group by deviceid, dstport, test_day
would be wrong.
next prev parent reply other threads:[~2011-04-18 12:46 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-18 5:57 [Bismark-devel] FCC app progress: django Nick Feamster
2011-04-18 7:57 ` [Bismark-devel] [Hndr] " Nick Feamster
2011-04-18 12:18 ` Dave Taht
2011-04-18 12:46 ` Dave Taht [this message]
2011-04-18 13:22 ` [Bismark-devel] [Hndr] Dates are really tricky!! Marshini Chetty
2011-04-18 14:11 ` Nick Feamster
2011-04-18 14:37 ` Srikanth Sundaresan
2011-04-18 14:39 ` Nick Feamster
2011-04-18 14:42 ` Dave Taht
2011-04-18 14:58 ` Beki Grinter
2011-04-18 15:03 ` Nick Feamster
2011-04-18 15:35 ` Marshini Chetty
2011-04-18 16:02 ` Keith Edwards
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DAC329E.3040603@taht.net \
--to=d@taht.net \
--cc=abhishekjain95@gmail.com \
--cc=bismark-devel@lists.bufferbloat.net \
--cc=hndr@lists.pixilab.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox