CRAWDAD metadata: stanford/gates (v. 2003-10-16)

This dataset contains traces of the Stanford CS department's wireless network.

Note: This metadata was prepared by the CRAWDAD team and verified by the data set (or tool) authors. We have made every effort to ensure its accuracy, but urge all users to consider the metadata and data carefully and be sure that their use in research is consistent with the nature and limitations of the data. We welcome any corrections.




[Dataset] stanford/gates (v. 2003-10-16)


version v. 2003-10-16
changes
The initial version
bibtex
@MISC{stanford-gates-2003-10-16,
  author = {Diane Tang and Mary Baker},
  title = {{CRAWDAD} data set stanford/gates (v. 2003-10-16)}, 
  howpublished = {Downloaded from http://crawdad.cs.dartmouth.edu/stanford/gates},
  month = oct,  
  year = 2003
}
					
metadata last modified 2006-11-14
summary
This dataset contains traces of the Stanford CS department's wireless network.
release date 2003-10-16
measurement start 1999-09-20
measurement end 1999-12-12
authors Diane Tang
Mary Baker
web site http://www.crawdad.org/stanford/gates
keyword packet trace, SNMP, tcpdump, 802.11, authentication log
measurement purposes Usage Characterization
User Mobility Characterization
network type 802.11 infrastructure
environment
We collected a 12-week trace of a local-area wireless network installed 
throughout the Gates Computer Science Building of Stanford University.

The building is L-shaped (the longer edge is called the a-wing, and the
shorter the b-wing). It has four main floors with offices and labs, a
basement with classrooms and labs, and a fifth floor with a lounge
and a few offices. Each of the main floors has two access points,
one for each wing. Additionally, the first floor has an access point
for a large conference room; the library, which spans both the
second and third floors, also has an access point. The basement
has two access points, one near the classrooms and one for the
Interactive Room, a special research project in the department.
The smaller fifth floor only has one access point.

The wireless user community consists of 74 users who can be
roughly divided into four groups:

- 35 first-year PhD students, who were each given a laptop
with a WaveLAN card upon arrival (which corresponds to
the beginning of the trace). Their offices are primarily in the
2b wing.
- 22 graphics students and staff, the majority of whom
received laptops with WaveLAN cards a week into the
tracing period. Their offices are primarily in the 3b wing.
- Three robots, used by the robotics lab for research. The
robots do not have to authenticate themselves to reach the
outside network. While the robots are somewhat mobile, they
stay in the 1a wing. Although these WaveLAN cards are
intended to be used by the robots, students in the robotics lab
also use the network cards for session connections and web surfing.
- 14 other users (students, staff, and faculty) scattered
throughout the building.

In addition to these 74 users, there were also four users who
authenticated themselves but only connected to wired ports on the
public subnet rather than the wireless network. We do not
consider these users in the rest of this analysis of the wireless
network.
network
In the Gates Computer Science Building at Stanford University, 
administrators have made a "public" subnet available for any user 
affiliated with the university. Users desiring network access     
via this subnet must authenticate themselves to use their dynamically 
assigned IP address to access the rest of the departmental and 
university networks and the Internet.
This subnet is accessible both from a wireless network and from 
Ethernet ports in public places in the building, such as conference rooms,
lounges, the library, and labs. The wireless network is a
WaveLAN network with WavePoint II access points acting as
bridges between the wireless and wired networks. The access
points each have two slots for wireless network interfaces; both
slots are filled, one with older 2 Mbps cards to support the few
users who have not updated their hardware yet, and the other with
WaveLAN IEEE 802.11-compatible 10 Mbps cards.

Because all of the wireless users are on a single subnet
(which promotes roaming without the need for Mobile IP or other
such support), we gathered traces on the router that connects 
the public subnet to the rest of the departmental
wired network. The router is a 90 MHz Pentium running RedHat
Linux with two 10 Mbps network interfaces. One interface
connects to the public subnet, and the other connects to the
departmental network.
collection
To gather all of the information we wanted, we collected
three separate types of traces during a 12-week period
encompassing the 1999 Fall quarter (from Monday, September 20
through Sunday, December 12). 

The first trace we gathered is a tcpdump trace of the link-level 
and network-level headers of all packets that went through the router. 
We use this information in conjunction with the other two traces.

The second trace is an SNMP trace. Approximately every
two minutes, the router queries, via Ethernet, all twelve access
points for the MAC addresses of the hosts currently using that
access point as a bridge to the wired network. Once we know
which access point a MAC address uses for network access, we
know the approximate location (floor and wing) of the device with
that MAC address. We pair these MAC addresses with the link-level
addresses saved in the packet headers to determine the
approximate locations of the hosts in the tcpdump trace.
The overhead from the SNMP tracing is low: 530 packets or
50 KBytes is the average overhead from querying all twelve
access points every two minutes. The overhead for querying an
individual access point is 3.2 KBytes if no MAC addresses are
using that access point; otherwise, the base overhead is 14.5
KBytes for one user at an access point, plus 1 KByte for every
additional user.
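
The per-access-point figures above can be turned into a rough estimate of the cost of one polling round. The following is only an illustrative sketch that encodes the numbers quoted in the text; the function names `ap_overhead_kb` and `round_overhead_kb` and the example load are hypothetical and are not part of the original tracing tools.

```python
def ap_overhead_kb(users):
    """Approximate SNMP overhead (KBytes) for one query of one access point.

    Encodes the figures quoted above: 3.2 KBytes with no users,
    14.5 KBytes for one user, plus 1 KByte per additional user.
    """
    if users < 0:
        raise ValueError("users must be non-negative")
    if users == 0:
        return 3.2
    return 14.5 + 1.0 * (users - 1)


def round_overhead_kb(users_per_ap):
    """Sum the overhead over every access point queried in one round."""
    return sum(ap_overhead_kb(n) for n in users_per_ap)


# Hypothetical load: twelve access points, ten idle, one with 1 user, one with 3.
print(round_overhead_kb([0] * 10 + [1, 3]))
```

With all twelve access points idle, the same function gives about 38.4 KBytes per round, which sits below the reported 50 KByte average as one would expect.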

The last trace is the authentication log, which keeps track of
which users request authentication to use the network. Each
request has both the user's login name as well as the MAC address
from which the user makes the request. We pair these MAC
addresses with the link-level addresses saved in the tcpdump trace
to determine which user sends out each packet.
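
The MAC-address pairing described above can be sketched as a lookup keyed on MAC address and time. This is a minimal illustration under stated assumptions, not the authors' actual procedure: it assumes SNMP polls are available as (timestamp, access point, list of MACs) tuples, that a packet is placed at the access point of the most recent poll that saw its MAC, and that a sighting older than `max_age` seconds (a hypothetical parameter) is ignored. All function and variable names are invented for the example.

```python
from bisect import bisect_right
from collections import defaultdict


def build_location_index(snmp_polls):
    """Index SNMP sightings by MAC: mac -> ([timestamps], [access points])."""
    index = defaultdict(lambda: ([], []))
    for ts, ap, macs in sorted(snmp_polls, key=lambda p: p[0]):
        for mac in macs:
            times, aps = index[mac]
            times.append(ts)
            aps.append(ap)
    return index


def locate(index, mac, ts, max_age=240):
    """Return the access point from the latest poll at or before ts, or None."""
    if mac not in index:
        return None
    times, aps = index[mac]
    i = bisect_right(times, ts) - 1
    if i < 0 or ts - times[i] > max_age:
        return None
    return aps[i]


def annotate(packets, index, mac_to_user):
    """Attach (user, location) to each (timestamp, mac, size) packet header."""
    for ts, mac, size in packets:
        yield ts, size, mac_to_user.get(mac), locate(index, mac, ts)


# Hypothetical data: one laptop seen on 2b, then on 3b two minutes later.
polls = [(0, "2b", ["aa:aa"]), (120, "3b", ["aa:aa"])]
idx = build_location_index(polls)
print(list(annotate([(60, "aa:aa", 512)], idx, {"aa:aa": "user1"})))
```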
sanitization
We obtained permission to collect these traces from the
Department Chair and informed all network users that this tracing
was taking place. We additionally informed users we would
record packet header information only (not the contents) and that
we would anonymize the data. Knowledge of the tracing may have
perturbed user behavior, but we have no way of quantifying the
effect.
tracesets included stanford/gates/combined (v. 2003-10-16)

[Traceset] stanford/gates/combined (v. 2003-10-16)


version v. 2003-10-16
changes
The initial version
bibtex
@MISC{stanford-gates-combined-2003-10-16,
  author = {Diane Tang and Mary Baker},
  title = {{CRAWDAD} trace set stanford/gates/combined (v. 2003-10-16)}, 
  howpublished = {Downloaded from http://crawdad.cs.dartmouth.edu/stanford/gates/combined},
  month = oct,  
  year = 2003
}
					
metadata last modified 2006-11-14
summary
This traceset contains traces of the Stanford CS department's wireless network.
release date 2003-10-16
measurement start 1999-09-20
measurement end 1999-12-12
measurement purposes Usage Characterization
User Mobility Characterization
methodology
We use the common timestamp and MAC address
information to combine three traces 
(tcpdump, SNMP, and authentication logs) into a single trace. 
The original three traces are not publicly available.
sanitization
We have anonymized the user and remote host names 
for privacy reasons.
parent data stanford/gates (v. 2003-10-16)
traces included stanford/gates/combined/anon (v. 2003-10-16)

[Trace] stanford/gates/combined/anon (v. 2003-10-16)


version v. 2003-10-16
changes
The initial version
bibtex
@MISC{stanford-gates-combined-anon-2003-10-16,
  author = {Diane Tang and Mary Baker},
  title = {{CRAWDAD} trace stanford/gates/combined/anon (v. 2003-10-16)}, 
  howpublished = {Downloaded from http://crawdad.cs.dartmouth.edu/stanford/gates/combined/anon},
  month = oct,  
  year = 2003
}
					
metadata last modified 2006-11-14
summary
This trace records activity on the Stanford CS department's wireless network.
derived true
release date 2003-10-16
measurement start 1999-09-20
measurement end 1999-12-12
format
[time] [pkt size] [username] [access point loc] [app] [dir] [remote host] 

dir is the direction: incoming, outgoing, both (i.e., internal),
or neither (e.g., DHCP had not yet completed).
app is the application; if the application is not recognized,
it is given as a dotted port number (src/dst).
time is at second granularity.
pkt size is in bytes.
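
As a starting point for reading the trace, the seven-field layout above could be parsed as follows. This is a sketch under assumptions that the metadata does not confirm: fields are taken to be separated by whitespace, no field is assumed to contain embedded spaces, and time and pkt size are assumed to be plain integers. The `Record` type, the `parse_line` function, and the sample line are all hypothetical; check them against the actual file before relying on them.

```python
from typing import NamedTuple


class Record(NamedTuple):
    time: int          # seconds (second granularity)
    pkt_size: int      # bytes
    username: str      # anonymized user name
    ap_loc: str        # access point location
    app: str           # application name or dotted src/dst ports
    direction: str     # the "dir" field
    remote_host: str   # anonymized remote host


def parse_line(line):
    """Split one trace line into a Record (assumes 7 whitespace-separated fields)."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    return Record(int(fields[0]), int(fields[1]), *fields[2:])


# Made-up line for illustration only; real values will look different.
print(parse_line("937814400 1500 user12 2b http incoming host345"))
```

Raising on an unexpected field count, rather than guessing, makes it obvious early if the whitespace assumption does not hold for the real data.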
configuration
We use the common timestamp and MAC address
information to combine these three traces 	
(tcpdump, SNMP, and authentication logs) into a single 
trace with a total of 78,739,933 packets attributable 
to the 74 wireless users.
An additional 37,893,656 packets are attributable to 
the SNMP queries and 1,551,167 packets are attributable 
to the four wired users. The number of packets attributable 
to the SNMP queries might seem high, but each access point 
is queried every two minutes even if no laptops are 
actively generating traffic.
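
A quick back-of-the-envelope check shows why the SNMP count is plausible. The sketch below uses only figures stated in this metadata (the three packet counts, the two-minute polling interval, the 84 days from 1999-09-20 through 1999-12-12, and the average of 530 packets per polling round); the variable names are illustrative.

```python
wireless_packets = 78_739_933   # attributable to the 74 wireless users
snmp_packets = 37_893_656       # attributable to SNMP queries
wired_packets = 1_551_167       # attributable to the four wired users

total_packets = wireless_packets + snmp_packets + wired_packets
snmp_share = snmp_packets / total_packets

days = 84                                  # 1999-09-20 through 1999-12-12
rounds = days * 24 * 60 // 2               # one polling round every two minutes
estimated_snmp = rounds * 530              # average packets per round

print(f"total packets:        {total_packets:,}")
print(f"SNMP share of total:  {snmp_share:.1%}")
print(f"polling rounds:       {rounds:,}")
print(f"estimated SNMP pkts:  {estimated_snmp:,}")
```

The estimate comes to roughly 32 million packets, the same order of magnitude as the 37,893,656 reported, so continuous polling alone accounts for most of the SNMP traffic.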
note
Note that because we do not record any signal strength
information, and since our access points generally cover 
a whole wing of a floor, we cannot necessarily detect 
movement within a wing but only movement between access points.
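
Given this limitation, a mobility analysis can only count changes of access point. The sketch below illustrates one way to do that, assuming records have already been reduced to (time, username, access point location) tuples; the function name `count_ap_changes` and the sample data are hypothetical, and a change of access point is treated as a move even though it could also reflect a laptop re-associating without physically moving.

```python
def count_ap_changes(records):
    """Count, per user, how often consecutive records show a different access point.

    records: iterable of (time, username, ap_loc) tuples, in any order.
    Returns a dict mapping username -> number of access point changes.
    """
    last_ap = {}
    changes = {}
    for _, user, ap in sorted(records, key=lambda r: r[0]):
        changes.setdefault(user, 0)
        if user in last_ap and last_ap[user] != ap:
            changes[user] += 1
        last_ap[user] = ap
    return changes


# Made-up records: user1 moves 2b -> 3b -> 2b, user2 stays on 1a.
sample = [
    (1, "user1", "2b"),
    (2, "user1", "2b"),
    (3, "user1", "3b"),
    (4, "user2", "1a"),
    (5, "user1", "2b"),
]
print(count_ap_changes(sample))
```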
download url 121 MB tar.gz, available from US and UK
parent data stanford/gates/combined (v. 2003-10-16)

[Author] Diane Tang


email dtang@cs.stanford.edu
institution Stanford University
department Computer Science Department
position Research Associate
web site http://graphics.stanford.edu/~dtang/
related data/tools stanford/gates (v. 2003-10-16)

[Author] Mary Baker


email mgbaker@hp.com
institution HP Labs
web site http://www.hpl.hp.com/personal/Mary_Baker/
related data/tools stanford/gates (v. 2003-10-16)