Final report of ITS Center project: Transportation Data Clearinghouse
A Research Project Report
For the Center for ITS Implementation Research
A U.S. DOT University Transportation Center
TRANSPORTATION DATA CLEARINGHOUSE
Principal Investigator: Dr. Aaron Schroeder
Kathy Laskowski
Virginia Tech Transportation Institute
3500 Transportation Research Plaza (0536)
Blacksburg, VA 24061
Tel.: (540) 231-1505
Fax: (540) 231-1555
E-mail: hrakha@vt.edu
July 2003
Disclaimer:
The contents of this report reflect the views of the authors, who are responsible for the facts and the accuracy of the information presented herein. This document is disseminated under the sponsorship of the Department of Transportation, University Transportation Centers Program, in the interest of information exchange. The U.S. Government assumes no liability for the contents or use thereof.
In late 1999, the Virginia Tech Transportation Institute (VTTI), at the request of the Virginia Department of Transportation, began examining the feasibility of expanding Virginia’s Operational Database Management System (ODMS) to include data from neighboring states along the I-81 Corridor. The expanded system was intended to become a multi-state repository for “real time” traffic data that could provide decision-quality information to travelers from Bristol, Virginia to Harrisburg, Pennsylvania. The data would also be archived for later use as a planning tool by transportation professionals.
To achieve these goals, VTTI designed an expanded ODMS and determined whether the other states’ data could fulfill the design’s requirements. The first task in the design process was to interview traffic control professionals along the corridor. Information on the quality, format, connectivity, and other details of their data sources was collected and used to create the larger technical design plans.
The information gained from these interviews provided the backbone for the following plans for the expansion of the database. The design plans included:
· Conceptual System Model
· Clearinghouse Functional Requirements Plan
· Clearinghouse Conceptual, Logical and Physical Data Model Plan
· Functional Design Plan
These documents were then used to assess the different data sources according to technical integrity, institutional barriers, timeliness of data, and overall data quality, in order to determine whether the information was of high enough quality to meet the needs of an expanding ODMS.
A series of interview sessions was conducted with representatives from the Virginia, West Virginia, Pennsylvania and Maryland Departments of Transportation, the State and Highway Administration, and Traffic Administration officials. Questions posed to interviewees were tailored to their individual expertise in order to inventory specifics about operations in each participating state.
Information culled from these interviews ranged from data sharing procedures and processing to the specifics of data collection equipment and the reports that the equipment can provide. The final product of these interviews was a thorough inventory of the infrastructure and data processes in place in each state.
The Conceptual Architecture for the system (Figure 1) is based on a high-level data flow and control model across each of the participating states. The I-81 Corridor Conceptual System Model illustrates the decomposition of the input (or source) terminators for the information flow into and out of the system. The conceptual architecture is illustrated as a single function to demonstrate the external inputs, which are later decomposed to address each state’s data sources. The lowest level of decomposition will be the actual process that specifies the data flow constructed from an input of data, processes or subsystems. Each of the four states is depicted in detail with respect to its existing devices and traffic monitoring systems, road weather conditions, and incident management. The model provided a foundation upon which to devise strategies for disseminating information among prospective end-users through the data clearinghouse.
Figure 1. Conceptual Architecture.
All of the states had a variety of automated traffic recorder devices that collected traffic flow, counts, and vehicle classification by type. This data is aggregated and used at a later time for planning purposes. Real-time information varied in quality and quantity by state. This information is used by DOTs to better manage the roadway, as in the case of road weather information systems (RWIS), and may be relayed to the traveling public through various means, including changeable message signs (CMS), dynamic traffic alert signs (DTAS), and highway advisory radio (HAR).
The following table lists the existing data sources and whether their format is electronic or paper. Regardless of format, all data arriving at the clearinghouse would need to be processed by ITS operators and entered into an interface before being made available to the public.
Table 2. Proposed Data Sources for the Data Clearinghouse
| State / Content | VA | WVA | MD | PA |
| Incidents | State Police - CADS (Computer Aided Dispatch); DOT - VOIS (VA Operations Information System) | County 911 Dispatch Faxes | State Police Website | State Police Faxes |
| Traffic Flow | No available real-time data | No available real-time data | No available real-time data | No available real-time data |
| Weather | VDOT RWIS | - | - | - |
| Other | - | - | DTAS | VMS |
This real-time information was collected and validated via manual processes, meaning that these systems are not automated and need to be initiated by ITS operators.
Based on the information collected during the interviews, a clearinghouse requirements plan was created. This plan specifies the functional infrastructure necessary to support the data clearinghouse database. Processes involved in the infrastructure pertain to the obstruction-free acquisition, transformation, and integration of the data to be collected from the designated device types along I-81 across Virginia, West Virginia, Maryland, and Pennsylvania. The specified infrastructure also includes the network, hardware, and software functions needed to support these processes. By conforming to these requirements, the database management system will have the foundation through which travelers can be informed of traffic, weather, and road conditions from place to place along the I-81 Corridor.
The data clearinghouse database represents the primary point of storage for event data associated with traffic incidents and with weather and road conditions. Information in the database will be accessed at the discretion of VTTI end-users through an Online Analytical Processing (OLAP) application. Where sources permit real-time extraction, information stored in the database will be current and detailed in terms of its status. The database design will support the consolidation of real-time Intelligent Transportation System (ITS) data from designated information systems as well as data manually captured by an ITS operator. Information stored in the Data Clearinghouse will include material from sources such as the Weather Management Systems (WMS) and Variable Message Signs (VMS), as well as emergency operator logs, e-mails, and faxes. The design for the data clearinghouse will provide VTTI with both a logical and a physical construct of the model.
The following is the functional decomposition of the data clearinghouse system. The interfaces and their tasks are described in detail to provide a hierarchical perspective of the chosen architecture.
Figure 2. System Architecture.
The Data Acquisition Interface is used to collect data from data sources. The interface connects to and extracts data from information systems, providing direct electronic transfer of data. The interface also facilitates operator-led input for indirect data streams, such as faxes and e-mail. Furthermore, the Data Acquisition Interface communicates with the Data Validation and Integration Interface to report communication faults and failures. Within the architecture, the Data Acquisition Interface is built on one or more VTTI SQL Server facilities, such as Transact-SQL, distributed queries, and command-line applications.
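
The acquisition behavior described above (direct electronic extraction, operator-led entry, and fault reporting) can be illustrated with a minimal Python sketch. This is not the project's implementation, which is described as SQL Server based; the function `acquire`, the `AcquisitionResult` type, and the record fields are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AcquisitionResult:
    records: List[dict] = field(default_factory=list)  # raw records passed on to validation
    faults: List[str] = field(default_factory=list)    # sources that could not be reached


def acquire(electronic_sources: Dict[str, Callable[[], List[dict]]],
            operator_entries: List[dict]) -> AcquisitionResult:
    """Pull from each electronic source, then append operator-keyed records."""
    result = AcquisitionResult()
    for name, fetch in electronic_sources.items():
        try:
            for record in fetch():
                result.records.append({**record, "source": name, "entry": "electronic"})
        except ConnectionError as exc:
            # Recorded rather than raised, so one failed feed does not stop the others.
            result.faults.append(f"{name}: {exc}")
    for record in operator_entries:
        # Indirect streams (faxes, e-mail) arrive already keyed in by an ITS operator.
        result.records.append({**record, "entry": "operator"})
    return result
```

In this sketch the `faults` list stands in for the fault and failure reports sent to the Data Validation and Integration Interface.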
The Data Validation and Integration Interface assesses the integrity of the data received from the Data Acquisition Interface. The interface applies established transformation rules to the data types and values of the data and then integrates the successfully validated data. The Data Validation and Integration Interface uses SQL Server Data Transformation Services (DTS) Designer to perform the transformation tasks by selecting the source data and mapping the data columns to a set of transformations. The transformed data is then sent to its target database in the Data Clearinghouse.
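
The idea of mapping source columns to a set of transformations, and integrating only rows that validate, can be sketched in Python as follows. This does not reproduce DTS itself; the column names, the `COLUMN_MAP` table, and the `transform` function are assumptions made for illustration.

```python
from datetime import datetime
from typing import Callable, Dict, List, Tuple

# Hypothetical mapping: source column -> (target column, transformation rule).
COLUMN_MAP: Dict[str, Tuple[str, Callable]] = {
    "loc": ("location", str.strip),
    "type": ("incident_type", lambda v: v.strip().title()),
    "reported": ("first_reported", lambda v: datetime.strptime(v, "%Y-%m-%d %H:%M")),
}


def transform(rows: List[dict]) -> Tuple[List[dict], List[Tuple[dict, str]]]:
    """Apply every column rule; rows that fail any rule are set aside, not integrated."""
    accepted, rejected = [], []
    for row in rows:
        try:
            out = {target: rule(row[src]) for src, (target, rule) in COLUMN_MAP.items()}
        except (KeyError, ValueError, TypeError, AttributeError) as exc:
            rejected.append((row, repr(exc)))
            continue
        accepted.append(out)
    return accepted, rejected
```

Keeping the rejected rows together with the reason for rejection gives an operator something to review, which fits the manual validation steps the report describes for faxed and e-mailed data.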
The primary responsibility of the I-81 Data Clearinghouse database is to allow for the collection, organization and redistribution of traffic data. Collection of the data from the various state agencies is contingent upon the agencies’ authorization for VTTI to acquire the data. The information collected in the database is designed to inform the public through traveler advisories available via the Internet (and other types of channels) on an as-needed basis. The I-81 database design will combine relational and multidimensional design, where feasible, to support incoming and outgoing data transactions. This database design will support data access and multiple-transaction processing.
The Data Distribution Interface will be located on a SQL Server, thereby enabling reports and outgoing data streams to be generated through queries to the database. Upon receiving a request, the Data Distribution Interface uses an access interface to communicate with the database. It then retrieves the pertinent data and presents it to the end-user.
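
A request handled by the Data Distribution Interface amounts to a parameterized query against the clearinghouse database. The sketch below uses Python's built-in `sqlite3` module as a stand-in for SQL Server; the `incident` table layout, the sample rows, and the function names are invented for illustration and are not taken from the project.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a few made-up incident rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE incident (location TEXT, incident_type TEXT, region TEXT,"
        " current_status TEXT, last_updated TEXT)"
    )
    conn.executemany(
        "INSERT INTO incident VALUES (?, ?, ?, ?, ?)",
        [
            ("I-81 N MM 118", "Vehicle Accident", "VA", "Active", "2003-07-01 14:30"),
            ("I-81 S MM 150", "Disabled Vehicle", "VA", "Cleared", "2003-07-01 13:00"),
            ("I-81 N MM 5", "Vehicle Accident", "MD", "Active", "2003-07-01 14:45"),
        ],
    )
    return conn


def active_incidents(conn: sqlite3.Connection, region: str) -> list:
    """Return uncleared incidents for one region, most recently updated first."""
    rows = conn.execute(
        "SELECT location, incident_type, last_updated FROM incident"
        " WHERE region = ? AND current_status <> 'Cleared'"
        " ORDER BY last_updated DESC",
        (region,),
    )
    return [
        {"location": loc, "incident_type": kind, "last_updated": updated}
        for loc, kind, updated in rows
    ]
```

Calling `active_incidents(make_demo_db(), "VA")` returns only the active Virginia row, which is the kind of result set a traveler advisory page would then render.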
The plan also provides a high-level review of the networking, hardware, and software requirements for working in conjunction with existing systems, as well as plans for the further expansion of the system.
Data modeling is a practical method for identifying the information to collect and manage in a database. It involves analyzing informational resources, extracting the elements significant to the organization, and organizing those elements into the design of a database structure that is efficient and effective for information storage and retrieval. The conceptual, logical and physical data models presented in this document represent the progressive steps performed in designing the data clearinghouse database. In turn, the database will act as the primary point of reference for informing travelers of real-time traffic, weather, and road conditions along the corridor.
After the initial investigation of the existing data sources employed in the participating states, the currency of the data from those resources, and the feasibility of gaining access to and extracting data from those resources, the following guidelines were put into place regarding the content and data sources of interest for the Data Clearinghouse.
Table 3. Actual Data Sources for the Data Clearinghouse.
| State / Content | VA | WVA | MD | PA |
| Incidents | State Police - CADS; DOT - VOIS | none | CHART Web Site | State Police Fax Layout |
| Traffic Flow | - | - | - | - |
| Weather | VDOT RWIS | - | CHART Web Site | none |
| Other - DTAS | - | - | DTAS | |
| Other - VMS | - | - | - | None |
| Construction | - | - | CHART Web Site | PA Turnpike Web site, Penn DOT District 8 work zone web site |
The conceptual data model for the Data Clearinghouse (Figure 3) is the first step in the design of the target database. The Data Clearinghouse conceptual data model is an entity-relationship diagram that presents the candidate objects of interest concerning travel-related data from the designated data sources.
An entity is a significant object or idea about which information is stored in the available sources. An example of an entity is Incident, which can be described by such characteristics as the location, time, and details of a traffic incident.
Relationships indicate any association that may exist among these entities and that may represent information of relevance to the Data Clearinghouse. An example of a relationship in the conceptual model is the association between a vehicle accident and the set of weather conditions existing at the time of the incident.
Figure 3. Conceptual Data Model for the Data Clearinghouse.
The relationships contained in the conceptual data model are described as follows:
Table 4. Relationships in the Conceptual Data Model.
| Relationship | Description |
| Construction Lane Closure | The relationship wherein a construction activity (as detailed in a construction advisory) causes closed lanes. Aspects of this relationship include, for example, the effective period (start and expected end date) of the lane closure. |
| Incident Lane Closure | The relationship wherein an incident causes closed lanes. Aspects of this relationship include, for example, the effective period of the lane closure (when the lanes were closed and the expected time of reopening). |
| Conditions Advisory | The relationship between a weather advisory and the weather conditions manifest at that particular time. |
| Conditions at Event | The relationship between an incident and the weather conditions manifest at the time of the incident. |
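
As an illustration of how one of these relationships might be resolved in practice, the following Python sketch pairs an incident with the weather observation in effect when it was reported, in the spirit of the "Conditions at Event" relationship. The class names, fields, and the rule of taking the most recent observation at or before the report time are assumptions, not details from the design plans.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class WeatherConditions:
    observed_at: datetime
    summary: str


@dataclass(frozen=True)
class Incident:
    first_reported: datetime
    details: str


def conditions_at_event(incident: Incident,
                        observations: List[WeatherConditions]) -> Optional[WeatherConditions]:
    """Return the latest observation taken at or before the incident report time."""
    earlier = [o for o in observations if o.observed_at <= incident.first_reported]
    # default=None covers the case where no observation precedes the incident.
    return max(earlier, key=lambda o: o.observed_at, default=None)
```

A real implementation would also need to match on location, which this sketch leaves out to keep the relationship itself in focus.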
The logical data model (Figure 4) is the next step towards finalizing the database design. It translates the entities and relationships from the conceptual model into a model that emphasizes the structuring of the data into a design for the database, independent of the particular target platform (such as SQL Server for Windows NT). Such structuring includes the tables (or entities), applicable attributes with meaningful names, logical data types (e.g., character, number and date), a particular schema type (such as an entity-relationship model or a dimensional model), and other such logical considerations.
The logical data model for the Data Clearinghouse is designed to emphasize the three primary entities (Incidents, Travel Advisories, and Weather Conditions) of the integrated clearinghouse, suppressing unnecessary decomposition while accommodating typical reporting on these entities. An exception is weather-locations information, which is broken out of the weather-conditions entity and formed into weather-locations ref in order to take advantage of the relatively constant nature of weather-sensor locations in contrast with the attributes pertaining to weather conditions. This division removes the redundancy of repeating the same location information when weather conditions change and history is tracked in the database.
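
The effect of splitting location data out of the weather-conditions entity can be shown with a small schema sketch. It uses Python's built-in `sqlite3` module rather than SQL Server, and the table and column names (`weather_locations_ref`, `weather_conditions`, `surface_status`, and so on) as well as the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE weather_locations_ref (
    location_id INTEGER PRIMARY KEY,
    description TEXT NOT NULL          -- stored once per sensor site
);
CREATE TABLE weather_conditions (
    location_id    INTEGER NOT NULL REFERENCES weather_locations_ref(location_id),
    observed_at    TEXT NOT NULL,
    surface_status TEXT NOT NULL       -- changes with every observation
);
""")

# One sensor site, two observations: the location text is not repeated.
conn.execute("INSERT INTO weather_locations_ref VALUES (1, 'I-81 sensor site A')")
conn.executemany(
    "INSERT INTO weather_conditions VALUES (1, ?, ?)",
    [("2003-01-10 06:00", "Wet"), ("2003-01-10 07:00", "Icy")],
)

# History is reassembled with a join when a report needs the location text.
history = conn.execute(
    "SELECT l.description, c.observed_at, c.surface_status"
    " FROM weather_conditions c JOIN weather_locations_ref l USING (location_id)"
    " ORDER BY c.observed_at"
).fetchall()
location_rows = conn.execute("SELECT COUNT(*) FROM weather_locations_ref").fetchone()[0]
```

However many observations accumulate for a site, `weather_locations_ref` still holds a single row for it, which is the redundancy saving the paragraph above describes.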
Incident Location: String
Incident Type: String
Last Updated: Datetime
First Reported: Datetime
Location Description: String
Incident Details: String
Region: String
Current Status: String
Lane Closures: String
Lane Type: String
Source: String
Effective End Date: Datetime
Figure 4. Logical Model for the Data Clearinghouse.
Another objective accomplished in the logical design is meaningful attribution of the entities. The entities in the model contain as many meaningful attributes as the data sources can accommodate, provided those attributes pertain to the needs of the database for informing the public on travel conditions. For example, the Incident entity contains attributes that answer such questions as what, when, where, and, to some extent, how and why with respect to the incidents. The question of who was involved in an incident is confidential and, therefore, is not deemed relevant to the needs of the database.
Another important feature of the logical design is the standardization of location referencing. Standardization is necessary for the identification of exact location points and boundaries relative to a geographical referencing system and relative to one another. Although location information from the sources tends to be broad, it appears that enough detail is provided for at least one location point at or near the boundary of an area to be identified. Therefore, the logical design imposes at least one location point on the entities, requiring the ITS operator to isolate the location point from source information. The attribute “Principle Start Location” in the entity “Travel Advisory” is an example of a location point imposed on the database.
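
The rule that every record carries at least one location point can be expressed as a constructor check, as in the Python sketch below. The route-and-milepost form of `LocationPoint` is an assumption, since the report does not specify the referencing system; the field name `principle_start_location` mirrors the attribute name quoted above, and `end_location` is a hypothetical optional field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocationPoint:
    route: str       # e.g. "I-81" (assumed representation)
    milepost: float  # assumed linear reference along the route


@dataclass
class TravelAdvisory:
    description: str
    principle_start_location: LocationPoint        # mandatory location point
    end_location: Optional[LocationPoint] = None    # hypothetical optional boundary

    def __post_init__(self) -> None:
        # Reject advisories the operator has not tied to a location point.
        if not isinstance(self.principle_start_location, LocationPoint):
            raise ValueError("a travel advisory needs at least one location point")
```

Enforcing the rule at the point of entry means an operator working from a fax or e-mail must isolate a location before the record is accepted, rather than leaving the gap to be found at reporting time.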
The physical data model (Figure 5) is the final step in the database design, providing the specifications for implementing the data structure on a particular database platform. In the case of the I-81 Data Clearinghouse, the physical data model is designed for implementation on a SQL Server database running on a Windows NT platform.
