CRAWDAD metadata: ibm/watson (v. 2003-02-19)

This dataset includes SNMP records for a corporate research center (IBM Watson research center) over several weeks

Note: This metadata was prepared by the CRAWDAD team and verified by the data set (or tool) authors. We have made every effort to ensure its accuracy, but urge all users to consider the metadata and data carefully and be sure that their use in research is consistent with the nature and limitations of the data. We welcome any corrections.


[Dataset] ibm/watson (v. 2003-02-19)

version v. 2003-02-19
changes
the initial version
bibtex
@MISC{ibm-watson-2003-02-19,
  author = {Magdalena Balazinska and Paul Castro},
  title = {{CRAWDAD} data set ibm/watson (v. 2003-02-19)}, 
  howpublished = {Downloaded from http://crawdad.cs.dartmouth.edu/ibm/watson},
  month = feb,  
  year = 2003
}
					
metadata last modified 2006-11-09
summary
This dataset includes SNMP records for a corporate research center (IBM Watson research center) over several weeks
release date 2003-02-19
measurement start 2002-07-20
measurement end 2002-08-18
authors Magdalena Balazinska
Paul Castro
web site http://nms.lcs.mit.edu/~mbalazin/wireless/
keyword SNMP, 802.11, 802.11b
measurement purposes User Mobility Characterization
network type 802.11 infrastructure
environment
The 802.11b wireless local-area network that we studied is spread throughout three 
large corporate buildings hosting computer science and electrical engineering research 
groups. The largest of the buildings, which we call LBldg, has 131 access points and is 
approximately 10 miles away from the other buildings. The other buildings, MBldg and 
SBldg, are adjacent to each other. They have 36 and 10 access points respectively. 
The placement of access points in buildings is based on geometry (one access point per 
corridor, for instance). Extra access points are placed in a few highly used rooms, 
such as a customer laboratory in SBldg.
network
The network is configured to run in infrastructure mode, in which wireless clients 
connect to the wired network through access points distributed in the environment. 
All 177 access points are Cisco Aironet 350s. We observed a total of 1366 unique MAC 
addresses. Laptops were by far the predominant devices on the network. We have no 
information on whether any other types of devices were used. We assume that each 
unique MAC address corresponds to a user, even though it is possible for a single user 
to have more than one MAC address or for users to trade cards with each other.
collection
We used SNMP to poll access points every 5 minutes, from Saturday, July 20th 2002 
through Sunday, August 18th 2002.
sanitization
Users were not informed that the study was performed. The only sensitive information 
that we gathered was the MAC and IP addresses of network cards, as well as the names 
assigned to access points. To ensure user privacy, we anonymized all three types of 
information.
tracesets included ibm/watson/snmp (v. 2003-02-19)

[Traceset] ibm/watson/snmp (v. 2003-02-19)

version v. 2003-02-19
changes
the initial version
bibtex
@MISC{ibm-watson-snmp-2003-02-19,
  author = {Magdalena Balazinska and Paul Castro},
  title = {{CRAWDAD} trace set ibm/watson/snmp (v. 2003-02-19)}, 
  howpublished = {Downloaded from http://crawdad.cs.dartmouth.edu/ibm/watson/snmp},
  month = feb,  
  year = 2003
}
					
metadata last modified 2006-10-17
summary
This traceset includes SNMP records collected by polling APs every 5 minutes in a corporate research center (IBM Watson research center) over several weeks
release date 2003-02-19
measurement start 2002-07-20
measurement end 2002-08-18
measurement purposes User Mobility Characterization
methodology
We used SNMP to poll access points every 5 minutes, from Saturday, July 20th 2002 
through Sunday, August 18th 2002. We chose 5-minute intervals to ensure that our study 
would not affect access point performance. We collected information about the traffic 
going through each access point as well as about the list of users associated 
with each access point. For each user, we retrieved detailed information on the amount 
of data (bytes and packets) transferred, the error rates, the latest signal strength, 
and the latest signal quality. We polled all access points except three located 
in MBldg that did not respond to SNMP requests.
sanitization
- Site names have been anonymized into LBldg, MBldg, and SBldg.
- Access point names have been anonymized by computing the SHA-1 hash of their name 
(concatenated with a secret) and prepending the anonymized site name to it.
- MAC addresses and IP addresses have been anonymized by computing the SHA-1 hash of 
their values (concatenated with a secret).
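
The anonymization described above can be sketched as follows. This is only an illustration, not the authors' actual tool: the secret value, the order of concatenation (value first, then secret), the UTF-8 encoding, the hexadecimal output, and the example names are all assumptions that the metadata does not specify.

```python
import hashlib

# Hypothetical secret; the real one was never published.
SECRET = "example-secret"


def anonymize(value: str, secret: str = SECRET) -> str:
    """Return the SHA-1 hex digest of a value concatenated with a secret."""
    return hashlib.sha1((value + secret).encode("utf-8")).hexdigest()


def anonymize_ap_name(site: str, ap_name: str, secret: str = SECRET) -> str:
    """Prepend the anonymized site name to the hashed access point name."""
    return site + anonymize(ap_name, secret)


# Made-up access point name and MAC address, for illustration only.
print(anonymize_ap_name("LBldg", "ap-example-01"))
print(anonymize("00:11:22:33:44:55"))
```

Because the hash is deterministic for a fixed secret, the same MAC address maps to the same anonymized string in every record, which is what allows a user to be followed across access points without revealing the original address.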
hole
Due to a power failure, there is a one-hour hole in the data 
(07/30/2002 from 1pm to 2pm). 
For unknown reasons, we also have a few holes in the data gathered at a few of 
the access points during the evening and night of 08/08/2002. 
Due to periods where access points were heavily loaded, some sample intervals 
stretch to 10 min.
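
Users of the traces may want to locate such holes before computing per-interval statistics. The sketch below flags any pair of consecutive polls that are more than 10 minutes apart, so that the stretched 10-minute intervals mentioned above are not reported. The function name, the threshold, and the sample timestamps are assumptions made for illustration; they are not taken from the data.

```python
from datetime import datetime, timedelta


def find_holes(timestamps, max_gap=timedelta(minutes=10)):
    """Return (previous, current) poll pairs whose spacing exceeds max_gap."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


# Illustrative poll times around the power failure; not real trace values.
polls = [
    datetime(2002, 7, 30, 12, 50),
    datetime(2002, 7, 30, 12, 55),
    datetime(2002, 7, 30, 14, 0),
    datetime(2002, 7, 30, 14, 5),
]

for start, end in find_holes(polls):
    print(start, "->", end, "gap of", end - start)
```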
note
The data in the following directories is organized by day. There is one directory 
for every day of the trace. There are three files (traces) for each access point and 
for each day. File names start with the name of the access point. The suffix of 
each file name indicates the type of information the file contains. On every poll of an 
access point, we appended data to each of these three files:
- (ibm_corporate-snmp-ap) File name ending with .snmp => table with data on the access point
- (ibm_corporate-snmp-interfaces) File name ending with -interfaces.snmp => 
table with data on the access point's wireless interface
- (ibm_corporate-snmp-users) File name ending with -users.snmp => table with data on users
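
Since all three suffixes end in .snmp, a script that sorts the files by type has to test the longer suffixes first. The following is a minimal sketch of that check; the function names and the example file names are hypothetical.

```python
# Longer suffixes come first because every trace file ends in ".snmp".
SUFFIXES = (
    ("-interfaces.snmp", "interfaces"),
    ("-users.snmp", "users"),
    (".snmp", "ap"),
)


def trace_type(file_name: str) -> str:
    """Classify a trace file as 'ap', 'interfaces', or 'users' by its suffix."""
    for suffix, kind in SUFFIXES:
        if file_name.endswith(suffix):
            return kind
    raise ValueError(f"not a trace file: {file_name}")


def access_point_name(file_name: str) -> str:
    """Strip the type suffix to recover the (anonymized) access point name."""
    for suffix, _ in SUFFIXES:
        if file_name.endswith(suffix):
            return file_name[: -len(suffix)]
    raise ValueError(f"not a trace file: {file_name}")


# Made-up file names for illustration.
for name in ("LBldgabc123.snmp", "LBldgabc123-interfaces.snmp", "LBldgabc123-users.snmp"):
    print(name, "->", trace_type(name), access_point_name(name))
```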
download url Download (108 MB tar.gz)
parent data ibm/watson (v. 2003-02-19)
traces included ibm/watson/snmp/ap (v. 2003-02-19)
ibm/watson/snmp/interfaces (v. 2003-02-19)
ibm/watson/snmp/users (v. 2003-02-19)

[Trace] ibm/watson/snmp/ap (v. 2003-02-19)

version v. 2003-02-19
changes
the initial version
bibtex
@MISC{ibm-watson-snmp-ap-2003-02-19,
  author = {Magdalena Balazinska and Paul Castro},
  title = {{CRAWDAD} trace ibm/watson/snmp/ap (v. 2003-02-19)}, 
  howpublished = {Downloaded from http://crawdad.cs.dartmouth.edu/ibm/watson/snmp/ap},
  month = feb,  
  year = 2003
}
					
metadata last modified 2006-10-17
summary
This trace includes SNMP records about AP information such as number of inbound/outbound packets
derived false
release date 2003-02-19
measurement start 2002-07-20
measurement end 2002-08-18
format
Each trace consists of 14 fields as follows:
1.  site        (string, anonymized)
2.  day         (date)
3.  moment      (time)
4.  name        (string, anonymized)
5.  sysUpTime   (time)
6.  snmpInPkts  (int unsigned)
7.  snmpOutPkts (int unsigned)
8.  ipIn        (int unsigned)
9.  ipOut       (int unsigned)
10. ipFwd       (int unsigned)
11. tcpIn       (int unsigned)
12. tcpOut      (int unsigned)
13. udpIn       (int unsigned)
14. udpOut      (int unsigned)

The following is the description of each field:
- site: the building where the access point was located
- day and moment: timestamp of the poll
- name: anonymized access point name

From the standard MIB-II (RFC1213): Management Information Base for Network 
Management of TCP/IP-based Internets, we collected the following information 
for each access point:

- sysUpTime: The time (in hundredths of a second) since the network management 
portion of the system was last re-initialized.
- snmpInPkts: The total number of Messages delivered to the SNMP entity 
from the transport service.
- snmpOutPkts: The total number of SNMP Messages which were passed from the SNMP 
protocol entity to the transport service.
- ipInReceives (ipIn): The total number of input datagrams received from interfaces, 
including those received in error.
- ipOutRequests (ipOut): The total number of IP datagrams which local IP user-protocols 
(including ICMP) supplied to IP in requests for transmission.  Note that 
this counter does not include any datagrams counted in ipForwDatagrams.
- ipForwDatagrams (ipFwd): The number of input datagrams for which this entity was not 
their final IP destination, as a result of which an attempt was made to 
find a route to forward them to that final destination.  In entities which do 
not act as IP Gateways, this counter will include only those packets which 
were Source-Routed via this entity, and the Source-Route option processing 
was successful.
- tcpInSegs (tcpIn): The total number of segments received, including those received in error.
This count includes segments received on currently established connections.
- tcpOutSegs (tcpOut): The total number of segments sent, including those on current 
connections but excluding those containing only retransmitted octets.
- udpInDatagrams (udpIn): The total number of UDP datagrams delivered to UDP users.
- udpOutDatagrams (udpOut): The total number of UDP datagrams sent from this entity.
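
The MIB-II objects listed above are cumulative counters, so per-interval activity has to be derived by differencing consecutive polls. MIB-II counters are 32 bits wide and wrap to zero, and they restart when the access point reboots. The sketch below shows one way to handle both cases. It assumes that sysUpTime and the counters have already been parsed into integers (the on-disk representation is not specified here), and the function names and sample values are illustrative.

```python
COUNTER_MODULUS = 2 ** 32  # MIB-II Counter objects are 32-bit and wrap to zero


def counter_delta(previous: int, current: int) -> int:
    """Increase of a 32-bit counter between two polls, allowing for one wrap."""
    return (current - previous) % COUNTER_MODULUS


def rebooted(previous_uptime: int, current_uptime: int) -> bool:
    """A sysUpTime that went backwards suggests a restart between the polls."""
    return current_uptime < previous_uptime


def uptime_seconds(sys_uptime: int) -> float:
    """Convert sysUpTime from hundredths of a second to seconds."""
    return sys_uptime / 100.0


# Illustrative readings, not taken from the traces.
print(counter_delta(100, 250))
print(counter_delta(4294967290, 10))
print(rebooted(500000, 1200))
print(uptime_seconds(30000))
```

When a reboot is detected, the delta for that interval is not meaningful and is probably best discarded rather than interpreted as a wrap.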
configuration
SNMP polling of each access point every 5 minutes
parent data ibm/watson/snmp (v. 2003-02-19)

[Trace] ibm/watson/snmp/interfaces (v. 2003-02-19)

version v. 2003-02-19
changes
the initial version
bibtex
@MISC{ibm-watson-snmp-interfaces-2003-02-19,
  author = {Magdalena Balazinska and Paul Castro},
  title = {{CRAWDAD} trace ibm/watson/snmp/interfaces (v. 2003-02-19)}, 
  howpublished = {Downloaded from http://crawdad.cs.dartmouth.edu/ibm/watson/snmp/interfaces},
  month = feb,  
  year = 2003
}
					
metadata last modified 2006-10-17
summary
This trace includes SNMP records about AP network interface such as bytes of inbound/outbound traffic, number of errors, and number of discarded packets
derived false
release date 2003-02-19
measurement start 2002-07-20
measurement end 2002-08-18
format
Each trace consists of 16 fields as follows:
1.  site           (string, anonymized)
2.  day            (date)
3.  moment         (time)
4.  name           (string, anonymized)
5.  ifIndex        (int)
6.  ifType         (string)
7.  ifSpeed        (int unsigned)
8.  ifPhysAddress  (string, anonymized)
9.  ifInOct        (int unsigned)
10. ifInUcastPkts  (int unsigned)
11. ifInErrors     (int unsigned)
12. ifInDiscards   (int unsigned)
13. ifOutOct       (int unsigned)
14. ifOutUcastPkts (int unsigned)
15. ifOutErrors    (int unsigned)
16. ifOutDiscards  (int unsigned)
					
The following is the description of each field.
- site: building where access point is located
- day + moment : timestamp of poll
- name: name of access point
- ifIndex and ifType: interface index and type (to recognize the wireless interface)

From the standard MIB-II (RFC1213): Management
Information Base for Network Management of TCP/IP-based Internets, we
collected the following information for each access point's wireless
interface:

- ifSpeed: An estimate of the interface's current bandwidth in bits per second.  
For interfaces which do not vary in bandwidth or for those where no accurate 
estimation can be made, this object should contain the nominal bandwidth.
- ifPhysAddress: The interface's address at the protocol layer immediately 
`below' the network layer in the protocol stack.  For interfaces which do not 
have such an address (e.g., a serial line), this object should contain an
octet string of zero length.
- ifInOctets (ifInOct): The total number of octets received on the interface, including 
framing characters.
- ifInUcastPkts: The number of subnetwork-unicast packets delivered to a 
higher-layer protocol.
- ifInErrors: The number of inbound packets that contained errors preventing them 
from being deliverable to a higher-layer protocol.
- ifInDiscards: The number of inbound packets which were chosen to be discarded 
even though no errors had been detected to prevent their being deliverable 
to a higher-layer protocol.  One possible reason for discarding such a packet 
could be to free up buffer space.
- ifOutOctets (ifOutOct): The total number of octets transmitted out of the interface, 
including framing characters.
- ifOutUcastPkts: The total number of packets that higher-level protocols requested
be transmitted to a subnetwork-unicast address, including those that were 
discarded or not sent.
- ifOutErrors: The number of outbound packets that could not be transmitted 
because of errors.
- ifOutDiscards: The number of outbound packets which were chosen to be discarded 
even though no errors had been detected to prevent their being transmitted.  
One possible reason for discarding such a packet could be to free up buffer space.
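
As an illustration of how these interface counters might be used, the sketch below turns two consecutive octet readings into an average bit rate and compares it with ifSpeed. It assumes the counters have been parsed into integers and that the polls are 300 seconds apart. The function names, the sample readings, and the 11 Mb/s value used for ifSpeed are assumptions for the example, not values taken from the traces.

```python
COUNTER_MODULUS = 2 ** 32  # MIB-II octet counters are 32-bit and wrap to zero


def throughput_bps(previous_octets: int, current_octets: int, interval_s: float) -> float:
    """Average bits per second between two polls of an octet counter."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    delta = (current_octets - previous_octets) % COUNTER_MODULUS
    return delta * 8 / interval_s


def utilization(bits_per_second: float, if_speed: int) -> float:
    """Fraction of the reported interface bandwidth (ifSpeed, in bits/s) in use."""
    return bits_per_second / if_speed


# Illustrative readings for one 5-minute interval.
rate = throughput_bps(0, 37_500_000, 300)
print(rate)
print(round(utilization(rate, 11_000_000), 4))
```

The result is an average over the whole polling interval, so short bursts of traffic are smoothed out.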
configuration
SNMP polling of each access point's wireless interface every 5 minutes
parent data ibm/watson/snmp (v. 2003-02-19)

[Trace] ibm/watson/snmp/users (v. 2003-02-19)

version v. 2003-02-19
changes
the initial version
bibtex
@MISC{ibm-watson-snmp-users-2003-02-19,
  author = {Magdalena Balazinska and Paul Castro},
  title = {{CRAWDAD} trace ibm/watson/snmp/users (v. 2003-02-19)}, 
  howpublished = {Downloaded from http://crawdad.cs.dartmouth.edu/ibm/watson/snmp/users},
  month = feb,  
  year = 2003
}
					
metadata last modified 2006-10-17
summary
This trace includes SNMP records about network users such as number of packets and bytes from/to each user's machine.
derived false
release date 2003-02-19
measurement start 2002-07-20
measurement end 2002-08-18
format
Each trace contains the following 22 fields:
1.  site           (string, anonymized)
2.  day            (date)
3.  moment         (time)
4.  parent         (string, anonymized)
5.  aid            (int unsigned)
6.  state          (string, only "associated" users recorded)
7.  shortRet       (int unsigned)
8.  longRet        (int unsigned)
9.  strength       (int)
10. quality        (int)
11. mac            (string, anonymized)
12. classID        (string, only "clientStations" recorded)
13. srcPkts        (int unsigned)
14. srcOct         (int unsigned)
15. srcErrPkts     (int unsigned)
16. srcErrOct      (int unsigned)
17. dstPkts        (int unsigned)
18. dstOct         (int unsigned)
19. dstErrPkts     (int unsigned)
20. dstErrOct      (int unsigned)
21. dstMaxRetryErr (int unsigned)
22. ip             (string, anonymized)

- site: building where access point is located
- day + moment : timestamp of poll
- parent: name of access point

From the Cisco Aironet Access Point MIB (AWCVX-MIB.my) we collected
information about users:

- awcDot11TpFdbAID (aid): AID with which the Station is associated with this system, 
or 2008 if the Station is not currently known to be associated.  If the entry is 
multicast, awcDot11TpFdbAID is 0.  Note that the uplink from a Client or Repeater 
AP to its parent is always AID 1.
- awcDot11TpFdbClientState (state): 802.11 Service State of the Station. The state can be 
one of the following: state0 (station not able to send any frames whatsoever. 
It is most likely not yet configured), state1 (station can send Class 1 frames.  
It is Unauthenticated and Unassociated), state2 (Station can send Class 2 frames. 
It is Authenticated, but is as yet Unassociated), and state3 (Station can send 
Class 3 frames. It is both Authenticated and Associated).
- awcDot11TpFdbTxShortRetries (shortRet): The total number of 802.11 Short Retries (RTS retries) 
incurred across all packet Transmission Attempts to this Station.
- awcDot11TpFdbTxLongRetries (longRet): The total number of 802.11 Long Retries (data retries) 
incurred across all packet Transmission Attempts to this Station.
- awcDot11TpFdbLatestRxSignalStrength (strength): A device-dependent measure of the signal 
strength of the most recently received packet from this Station.  Might be normalized 
or unnormalized.
- awcDot11TpFdbLatestRxSignalQuality (quality): A device-dependent measure of the signal 
quality of the most recently received packet from this Station.
- awcDot11TpFdbAddress (mac): MAC address
- awcTpFdbSrcPktsImmed (srcPkts): Number of observed packets for which this station was 
the source.
- awcTpFdbSrcOctetsImmed (srcOct): Number of observed octets for which this station was 
the source.
- awcTpFdbSrcErrorPktsImmed (srcErrPkts): Number of observed error packets for which this station 
was the source.
- awcTpFdbSrcErrorOctetsImmed (srcErrOct): Number of observed error octets for which this entry 
was the source.
- awcTpFdbDestPktsImmed (dstPkts): Number of observed packets for which this station was 
the destination.
- awcTpFdbDestOctetsImmed (dstOct): Number of observed octets for which this station was 
the destination.
- awcTpFdbDestErrorPktsImmed (dstErrPkts): Number of observed error packets for which this station 
was the destination. This count includes awcTpFdbDestMaxRetryErrorsImmed.
- awcTpFdbDestErrorOctetsImmed (dstErrOct): Number of observed error octets for which this station 
was the destination.
- awcTpFdbDestMaxRetryErrorsImmed (dstMaxRetryErr): Number of observed max-retry error packets for 
which this station was the destination.
- awcTpFdbIPv4Addr (ip): IPv4 network address of the station.
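
The 22 fields listed above can be mapped onto a keyed record as in the following sketch. It assumes that each line of a users trace has already been split into 22 string values. The field delimiter and the exact date and time formats are not specified here, so the splitting step is left out and the sample row is made up.

```python
USER_FIELDS = (
    "site", "day", "moment", "parent", "aid", "state", "shortRet", "longRet",
    "strength", "quality", "mac", "classID", "srcPkts", "srcOct", "srcErrPkts",
    "srcErrOct", "dstPkts", "dstOct", "dstErrPkts", "dstErrOct",
    "dstMaxRetryErr", "ip",
)

# Fields kept as text; all remaining fields are integers in the field list.
STRING_FIELDS = {"site", "day", "moment", "parent", "state", "mac", "classID", "ip"}


def parse_user_record(values):
    """Build a dict from the 22 already-split values of one users record."""
    if len(values) != len(USER_FIELDS):
        raise ValueError(f"expected {len(USER_FIELDS)} values, got {len(values)}")
    return {
        name: raw if name in STRING_FIELDS else int(raw)
        for name, raw in zip(USER_FIELDS, values)
    }


# Made-up sample row for illustration only.
sample = [
    "LBldg", "2002-07-22", "10:05:00", "LBldgabc123", "5", "associated",
    "0", "3", "30", "80", "f00d", "clientStations",
    "120", "9000", "1", "60", "200", "150000", "2", "3000", "1", "beef",
]
record = parse_user_record(sample)
print(record["mac"], record["srcOct"] + record["dstOct"])
```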
configuration
SNMP polling of each access point's associated users every 5 minutes
parent data ibm/watson/snmp (v. 2003-02-19)

[Author] Magdalena Balazinska

email magda@cs.washington.edu
institution University of Washington
department Department of Computer Science and Engineering
position Assistant Professor
web site http://www.cs.washington.edu/homes/magda/
related data/tools ibm/watson (v. 2003-02-19)

[Author] Paul Castro

email pcastro@us.ibm.com
institution IBM T.J. Watson Research Center, Hawthorne, NY 10532
department Computer Science
position Research Staff Member
related data/tools ibm/watson (v. 2003-02-19)

[Paper] balazinska-wireless

category inproceedings
authors Magdalena Balazinska
Paul Castro
title Characterizing Mobility and Network Usage in a Corporate Wireless Local-Area Network
keywords measurement, wireless, ibm_watson, crawdad
booktitle Proceedings of the First International Conference on Mobile Systems, Applications, and Services (MobiSys)
pages 303-316
month May
year 2003
address San Francisco, CA
publisher USENIX Association
download url http://www.usenix.org/events/mobisys03/tech/balazinska.html
abstract
Wireless local-area networks are becoming increasingly popular. They are 
commonplace on university campuses and inside corporations, and they have 
started to appear in public areas. It is thus becoming increasingly important 
to understand user mobility patterns and network usage characteristics on 
wireless networks. Such an understanding would guide the design of applications 
geared toward mobile environments (e.g., pervasive computing applications), 
would help improve simulation tools by providing a more representative workload 
and better user mobility models, and could result in a more effective 
deployment of wireless network components. Several studies have recently 
been performed on wireless university campus networks and public networks. In 
this paper, we complement previous research by presenting results from a four 
week trace collected in a large corporate environment. We study user mobility 
patterns and introduce new metrics to model user mobility. We also analyze user 
and load distribution across access points. We compare our results with those 
from previous studies to extract and explain several network usage and mobility 
characteristics. We find that average user transfer-rates follow a 
power law. Load is unevenly distributed across access points and is influenced 
more by which users are present than by the number of users. We model user 
mobility with persistence and prevalence. Persistence reflects session 
durations whereas prevalence reflects the frequency with which users visit 
various locations. We find that the probability distributions of both measures 
follow power laws.
related data/tools ibm/watson