Surveying, Mapping and GIS

Exploring all aspects of mapping and geography, from field data collection, to mapping and analysis, to integration, applications development and enterprise architecture...


EPA ARRA Mapper

Posted by Dave Smith On 12/02/2010 09:20:00 PM 15 comments

Since joining EPA I've been engaged in a wide variety of projects and efforts - one which we are currently getting out the door is an upgraded mapper for EPA projects funded by the American Recovery and Reinvestment Act (ARRA), otherwise known as Stimulus or the Recovery Act.

The major categories of projects receiving EPA funding via ARRA include Superfund Hazardous Waste Cleanup, Leaking Underground Storage Tanks, Clean Water State Revolving Fund (typically wastewater treatment), Drinking Water State Revolving Fund (potable water), National Clean Diesel Campaign, and Brownfields.

The idea was to provide more granular data across the various programs through which EPA has been getting ARRA funding to projects.


The mapper reports on a quarterly basis, in concert with the ARRA reporting requirements, and was built on the ESRI Flex API. As a quick overview, it shows statewide figures, as a choropleth map, with summary tables:

Visitors can click on the menu to view awards by program category, or to view all awards, for example:


The pushpins indicating awards can then be selected, and info boxes will pop up with the details. In the example below, we asked the mapper to show "Chelsea, MA" and turned on Clean Diesel awards; clicking on the map pin, we get the goods: two awards for Chelsea (note: the spelling of "Collabrative" comes directly from the database):


This application should improve transparency, with the direct intent of giving users a tangible view of what's going on right at the community level. As for lessons learned, the technology was far less of a challenge than the learning curve of how government works, and navigating my way through various EPA offices and stakeholders and gaining their acceptance and participation. My many thanks to all those who helped out.

FEA, CPIC, LoB, and Strategic Alignment...

Posted by Dave Smith On 11/14/2010 08:59:00 AM 7 comments

Since joining a federal agency and becoming a national program manager of a federal IT investment, I have been navigating various things like Capital Planning and Investment Control (CPIC), Line of Business (LoB) initiatives, Federal Enterprise Architecture (FEA), and other pieces – there is certainly no shortage of process and documentation required throughout the lifecycle as systems are planned, designed, built, maintained and decommissioned. However, for all of its compliance documentation, metrics and matrices, I still think there are a number of core disconnects.

CPIC and LoB initiatives should inform on investments and align them. One area where I see inadequate traceability is in mission support. Each and every IT investment should be able to map to a matrix of mission drivers: Agency Strategic Plan elements, stated priorities and core initiatives within the agency, and the specific laws and mandates which the Agency is charged with carrying out. In turn, these mappings can be aggregated and examined for alignment. If, for example, the mission objective is to assess the impact of a specific activity on a population, then there is now opportunity to understand how many IT investments relate to that assessment, and one can then get any potentially disparate and disconnected activities aligned and harmonized, leveraging each and taking advantage of opportunities to share things like data, models and infrastructure, instead of having stovepiped activities where each party reinvents the wheel independently of the next.
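
As a rough sketch in code of what such a traceability matrix might look like at its simplest - the investment and driver names here are purely illustrative, not actual EPA holdings:

```python
# Illustrative only: the investments and drivers below are made up; the
# point is the shape of the mapping and how it can be inverted.
investments = {
    "Investment A": ["Strategic Plan Goal 1", "Clean Air Act"],
    "Investment B": ["Strategic Plan Goal 2", "Clean Water Act"],
    "Investment C": ["Strategic Plan Goal 1", "Clean Air Act"],
}

# Invert the mapping: mission driver -> investments claiming alignment
by_driver: dict[str, list[str]] = {}
for investment, drivers in investments.items():
    for driver in drivers:
        by_driver.setdefault(driver, []).append(investment)

for driver, invs in sorted(by_driver.items()):
    # Several investments against one driver flags an opportunity to
    # share data, models and infrastructure rather than build stovepipes.
    print(f"{driver}: {len(invs)} investment(s) -> {', '.join(invs)}")
```

Inverting the mapping is the key step: it surfaces every mission driver that multiple investments touch, which is exactly where the sharing opportunities hide.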

Additionally, functional components should be mapped as well - for example, data requirements, modeling requirements, geodata hosting and web services needs, and so on. These can help to inform on infrastructure investments - for example, being able to build a robust, shared, scalable environment with load balancing and fault tolerance instead of a series of fragile, disconnected stovepipes with neither. It can help toward paradigm shifts like leveraging cloud capacity and other approaches which can provide cost savings - savings which can then hopefully be driven toward innovation and new development, rather than more stovepipes and reinventing of the wheel.

As I have a background which straddles many disciplines, when I hear “Enterprise Architecture” it conceptually still goes back to old-school, bricks-and-mortar architecture, where a building is built from a blueprint.

Consider the enterprise as a house. You have several different rooms in it, serving different functions, yet for it to function effectively as a house, the work of the architect is to draft up the elements that bring it all together into a functional, cohesive whole. That means the structural members to support the second floor as it is placed on the first; the stairs and corridors needed to connect the rooms; the placement of the rooms, giving them doors and windows to daylight where needed, and arranging them relative to each other and to the corridors and stairs to ensure good flows; the wiring to provide light, power and communications; the plumbing to bring water to and from where it's needed; the HVAC systems to regulate heat and cold; and so on.

Yet, for all the talk in IT EA communities, most organizations largely still function as a series of disconnected, disjointed rooms. The EA effort should serve as the master blueprint. It needs to be informed by those who need each room, but in turn also needs to inform on how everything connects, how things flow from one room to the next, where the wiring and plumbing are, and how to connect things and create meaningful flows, relationships and functionalities. For the developer, Enterprise Architecture should inform Solution Architecture, and where gaps are identified, that should in turn go back and inform Enterprise Architecture. The loops need to be closed. All of these things, FEA, CPIC, LoB and others, need to move beyond paperwork and compliance exercises, to become more robustly informed and cohesive, serving as the master blueprints and roadmaps.

In terms of metrics to gauge success, the best metrics would be those which demonstrate that alignment on mission, function and coordination have been achieved.

MyPropertyInfo

Posted by Dave Smith On 9/10/2010 10:54:00 AM 5 comments

As I have been ramping up on EPA's Facility Registry System over the last couple of months since coming on board with EPA, I have also had the opportunity to work on a number of other projects - one recent one that's rolled out is MyPropertyInfo.

The most truly fun thing about working in EPA's Office of Environmental Information is that they are involved in a lot of collaborative, cross-cutting efforts, so I get exposed to a lot of different things across the agency. As an example, in working with EPA's Freedom of Information Act (FOIA) officer Larry Gottesman and the FOIA staff, they were pursuing an idea of greater accessibility as a means of reducing FOIA requests - such as the common requests for data which is actually already published by EPA, but which may be scattered across separate locations in the agency.

One example of this is MyPropertyInfo - http://epa.gov/myproperty/

Here, we sought to address frequently asked questions about properties. This type of basic background and screening is highly useful and important to bankers, realtors, prospective buyers, developers and others who deal in real estate and properties - yet, to gather all of the relevant information about a property, one might have to visit multiple sites across EPA, or submit a FOIA request and wait for EPA to gather the data from those disparate sources. So what we did in the case of MyPropertyInfo is quickly roll out a tool that basically just gathers that existing content in one place, and additionally provides it in printer-friendly form.

Though it was essentially just screen-scraping (as we do not directly control some of the source reporting systems), it was nonetheless a quick and effective way of getting questions answered. Moving forward, it also demonstrates that by using approaches that provide easily integrated content, like web services in addition to traditional HTML reports, content can be even more elegantly repurposed and reused in a variety of effective ways to answer business questions - with web services associated with the reporting engines, the widgets and iPhone apps for these types of applications will virtually build themselves. For example, real estate sites like Zillow.com would also be able to dynamically pull environmental profile information about properties of interest to prospective buyers - hopefully a vision for the future at EPA.
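
As a rough illustration of that vision, here is a minimal sketch of what a consumer of such services might look like. The endpoint URLs and response shapes below are hypothetical placeholders, not actual EPA web services:

```python
import requests

# Hypothetical per-program report services; these URLs are placeholders
# for illustration, not actual EPA endpoints.
SOURCES = {
    "superfund":   "https://example.epa.gov/services/superfund",
    "brownfields": "https://example.epa.gov/services/brownfields",
    "ust":         "https://example.epa.gov/services/ust",
}

def property_profile(address: str) -> dict:
    """Gather records about one property from several program services."""
    profile = {}
    for program, url in SOURCES.items():
        resp = requests.get(url, params={"address": address}, timeout=10)
        profile[program] = resp.json() if resp.ok else None
    return profile

# e.g. property_profile("123 Main St, Chelsea, MA") would hand a widget,
# phone app or site like Zillow.com everything needed for a profile panel.
```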

Here is some additional perspective on MyPropertyInfo as posted to EPA's Greenversations blog by the FOIA office's Wendy Schumacher:  http://blog.epa.gov/blog/2010/08/30/my-property-info/

Traffic Congestion, LBS and ITS

Posted by Dave Smith On 8/14/2010 12:33:00 PM 10 comments

For something along a different theme, given my extremely late (1:30 AM) arrival back in Pennsylvania last night: Intelligent Transportation Systems (ITS) and Congestion Management.


Previously, I had been involved in a few projects involving Intelligent Transportation Systems - and yet, it still amazes me how far behind we are in terms of even basic approaches.  Last night, I was stuck in traffic due to an unfortunate accident ahead on the roadway.  My immediate observations:  A state trooper was sitting on the side of the road, with his mandate to alert drivers and monitor the end of the queue for problems.  However, where he was situated, as so often happens, was past the last exit available where motorists could get off the interstate and find an alternative.

Had he been situated ahead of that last exit, I and so many other motorists with onboard GPS could quite easily have hit our "DETOUR" buttons and navigated around the congestion rather than end up in the midst of it.  But instead, we end up stuck, and the congestion and queue only grows and grows.  Poor congestion management.

Secondly, a police cruiser sitting on the side of the road certainly alerts drivers of something. However, it doesn't give any specificity whatsoever - perhaps the trooper had just stopped a speeder, et cetera. Often they are sitting well behind the congestion queue, and sometimes it's not immediately evident that there is congestion ahead. The opportunity to inform motorists is lost, and the situation is not mitigated and managed as well as it could be.


It would seem to me that there are any number of relatively simple ways to address and mitigate congestion as a result of an accident or other similar traffic event - we certainly have ample technologies available.  For example, a portable variable message board that could rapidly be deployed by troopers (as in the photo), or other alerts.  There are numerous Variable Message Sign (VMS) boards along interstate corridors, yet amazingly, to this day, they still are largely uncoordinated, where messaging is not propagated along the corridor across district or state boundaries.  Highway officials still seem to not recognize that roadways are functionally networks, that internal administrative boundaries are not appropriate barriers as far as motorists are concerned.

It would also seem that protocols like CAP messaging, GeoRSS and others could and should be leveraged, combined with very simple wireless digital broadcast technologies aligned to highway advisory radio beacon broadcasts, to provide simple, low-cost means of transmitting location-based information to in-dash receivers, GPS units and so on.  Certainly some such systems exist, however via subscription, or at additional cost built into the price of the receivers, and so on - perhaps a better business model for such a broadcast system could operate via public-private partnerships, where operators of hotels, restaurants and amenities fund the system by providing basic information about available attractions and amenities when there are no highway incidents.  A perfect case for location-based services (LBS) technologies.  This does not have to be a costly, complicated thing.  We already have all of the ingredients, and have had them for several years.
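
To make the CAP piece concrete, here is a minimal sketch of generating a CAP 1.2 incident message. The identifier and sender values are made up for illustration, and a real deployment would add digital signing, expiry and distribution channels:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"

def traffic_incident_alert(headline, area_desc, lat, lon, radius_km):
    """Build a minimal CAP 1.2 alert document for a highway incident."""
    alert = ET.Element("alert", xmlns=CAP_NS)
    sent = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00:00")
    for tag, text in [("identifier", "PA-511-2010-0814-001"),    # made up
                      ("sender", "traffic@example.state.pa.us"),  # made up
                      ("sent", sent), ("status", "Actual"),
                      ("msgType", "Alert"), ("scope", "Public")]:
        ET.SubElement(alert, tag).text = text
    info = ET.SubElement(alert, "info")
    for tag, text in [("category", "Transport"),
                      ("event", "Traffic Incident"),
                      ("urgency", "Immediate"), ("severity", "Moderate"),
                      ("certainty", "Observed"), ("headline", headline)]:
        ET.SubElement(info, tag).text = text
    area = ET.SubElement(info, "area")
    ET.SubElement(area, "areaDesc").text = area_desc
    # CAP <circle> is "lat,lon radius", with the radius in kilometers
    ET.SubElement(area, "circle").text = f"{lat},{lon} {radius_km}"
    return ET.tostring(alert, encoding="unicode")

print(traffic_incident_alert("Accident on I-80 EB at MM 123; expect delays",
                             "I-80 eastbound, mile markers 120-125",
                             41.05, -77.52, 2.0))
```

The same document could feed a VMS corridor network, a GeoRSS feed and in-dash receivers alike, which is the whole point of standardizing the message rather than the device.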

Locational Data Policy and Tools

Posted by Dave Smith On 8/07/2010 11:22:00 AM 2 comments

As I've posted previously, one of the newest hats I now wear is national program manager for EPA's Facility Registry System (FRS), where I am collecting and managing locational data for 2.9 million unique sites and facilities across states, tribes, and territories - I'm certainly excited about being able to contribute some good ideas toward enhancing its capabilities and holdings, and toward collaborating and integrating with others across government.

A large part of this is data aggregated from other sources, such as data collected and maintained by state and tribal partners, EPA program offices and others, and then shared out via such means as EPA's Exchange Network. Historically, FRS has done what it can to improve data quality on the back end, by providing a locational record which aggregates up from the disparate underlying records, with layers of standardization, validation, verification and correction algorithms, as well as working with a national network of data stewards. This has iteratively resulted in vast improvements to the data, correcting common issues such as reversed latitude and longitude values, omitted signs in longitudes, partial or erroneous address fields and so on.

However, some issues with the data still remain, the weakness being in how data is collected, which imposes limits on what kinds of backend correction can be performed. In most cases, data is captured via basic text fields. The further upstream the data can be vetted and validated, the better - in particular, right at the point of capture, for example in instances where facility operators themselves enter the data.

So, here is the notion: a toolbox of plug-and-play web services and reusable code to replace the basic free-text field, allowing real-time parsing and verification of data as it is entered. Part of that may involve using licensed commercial APIs to help with address verification and disambiguation - for example, the Bing Maps capability to deal with an incomplete address or one with a typo. Given an entry such as "1200 Contitutoin, Wash DC", the web services would try to match it and return "Did you mean 1200 Constitution Avenue NW, Washington, DC?"
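
A point-of-capture widget might wrap such an API along these lines; note that the endpoint URL and response fields below are hypothetical placeholders standing in for whichever licensed verification service is ultimately used:

```python
import requests

# Hypothetical placeholder endpoint, standing in for a licensed
# address-verification API (Bing Maps or otherwise); the response shape
# below is likewise an assumption for illustration.
VERIFY_URL = "https://example.gov/address-verify"

def suggest_address(raw_entry: str) -> dict:
    """Submit a free-text address; print a 'did you mean' suggestion."""
    resp = requests.get(VERIFY_URL, params={"q": raw_entry}, timeout=10)
    resp.raise_for_status()
    result = resp.json()  # assumed shape: {"matched", "suggestion", ...}
    if result.get("matched") and result.get("suggestion"):
        print(f'Did you mean {result["suggestion"]}?')
    return result

# suggest_address("1200 Contitutoin, Wash DC")
# -> Did you mean 1200 Constitution Avenue NW, Washington, DC?
```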


Between suggesting an alternative that attempts to correct a partial and/or incorrect address, and providing an aerial photo as a visual cue for verification, this improves the likelihood that the user will double-check their entry and either accept the suggested alternative or type in a corrected address. Notionally, if the aerial photo view shows a big open field where there should be a large plastics plant, they would stop and wonder, and perhaps double-check the address they had entered.

That's certainly a good first step, and is something I'm currently looking to promote in the short term. The EPA stakeholders I've talked to are very interested in this, and I will look at developing some easy-to-integrate code for them to use.

But, to think more long-range, let's take that further - from the large universe of facilities that I deal with, not all things populating "address" fields are conventional street addresses. For example, remote mining activities in western states might instead be represented in the Public Land Survey System (PLSS), such as "Sec 17, Twp 6W Range 11N", or rural areas might simply use "Mile 7.5 on FM 1325 S of Anytown" or "PR 155 13.5km west of Anytown".

Again, perhaps there are ways to improve this - a longer-term discussion, but certainly the ingredients exist. A first step might be to look at developing guidance on consistent ways to have folks enter this type of data, for example "Sec 17, Twp 6W Range 11N" versus "S17-T6W-R11N", along with developing parsers that can understand and standardize the possible permutations that might be entered, including entry of quarter-section and meridian info, e.g. "NW1/4 SW1/4 SE1/4 SEC 22 T2S R3E MDM" for an entry that drills down into quarter sections to identify a 10-acre parcel, also referencing the Mount Diablo Meridian.
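
As a minimal sketch of what such a parser might look like, here is one handling just the permutations mentioned above; the grammar is an assumption for illustration, and a real parser would need to cover far more variation:

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PLSSLocation:
    section: int
    township: str                   # e.g. "6W"
    range_: str                     # e.g. "11N"
    quarters: List[str] = field(default_factory=list)  # e.g. ["NW", "SW"]
    meridian: Optional[str] = None  # e.g. "MDM" (Mount Diablo Meridian)

# Assumed grammar covering only the example permutations in this post
PLSS_RE = re.compile(
    r"""(?P<quarters>(?:[NS][EW]\s*1/4\s*)*)       # optional quarter chain
        S(?:EC(?:TION)?)?\s*(?P<sec>\d{1,2})[,\s-]*
        T(?:WP|OWNSHIP)?\s*(?P<twp>\d+\s*[NSEW])[,\s-]*
        R(?:NG|ANGE)?\s*(?P<rng>\d+\s*[NSEW])
        (?:[,\s-]*(?P<mer>[A-Z]{2,4}))?            # optional meridian code
    """,
    re.IGNORECASE | re.VERBOSE,
)

def parse_plss(text: str) -> Optional[PLSSLocation]:
    m = PLSS_RE.search(text)
    if not m:
        return None
    quarters = re.findall(r"[NS][EW]", m.group("quarters") or "", re.I)
    return PLSSLocation(
        section=int(m.group("sec")),
        township=re.sub(r"\s+", "", m.group("twp")).upper(),
        range_=re.sub(r"\s+", "", m.group("rng")).upper(),
        quarters=[q.upper() for q in quarters],
        meridian=m.group("mer").upper() if m.group("mer") else None,
    )

print(parse_plss("Sec 17, Twp 6W Range 11N"))
print(parse_plss("S17-T6W-R11N"))
print(parse_plss("NW1/4 SW1/4 SE1/4 SEC 22 T2S R3E MDM"))
```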


Currently, there isn't any truly standardized way of entering and managing these, but perhaps there is a role for the surveying community in developing standardized nomenclature to assist in database searching and indexing.  Coincident with this is potential collaborative development of ways to approach parsing and interpreting nonstandardized entries, along with leveraging existing PLSS data and geocoders built toward translating these into locational extents, such as a bounding box, provisioned with appropriate record-level metadata describing elements such as method of derivation and accuracy estimate.

In concert with this, obviously, should be an effort toward providing linkages to actual field survey polygonal data, as appropriate if it's a parcel-oriented effort (for example, for Superfund site cleanup and brownfields), and where such data is available.

Similarly, one could collaboratively develop guidance and parsers for dealing with the route-oriented elements, for example "Mile 7.5 on FM 1325 S of Anytown" or "PR 155 13.5km west of Anytown", toward standardizing these types of fields as well - for example, whether or not to disambiguate or expand FM as "Farm to Market Route", and in what order to place elements consistently.
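
A parser for the route-oriented forms might start along similar lines; again, the patterns here are an assumption covering only the two examples above:

```python
import re
from typing import NamedTuple, Optional

class RouteRef(NamedTuple):
    route: str                 # e.g. "FM 1325"
    offset_km: float           # distance from the reference point, in km
    direction: Optional[str]   # e.g. "S", "W"
    reference: str             # named reference point, e.g. "Anytown"

MILE_KM = 1.609344
DIR = r"(?P<dir>[NSEW]|north|south|east|west)"

PATTERNS = [
    # "Mile 7.5 on FM 1325 S of Anytown"
    re.compile(r"Mile\s+(?P<dist>[\d.]+)\s+on\s+"
               r"(?P<route>[A-Z]{1,3}\s*\d+)\s+" + DIR +
               r"\s+of\s+(?P<ref>.+)", re.I),
    # "PR 155 13.5km west of Anytown"
    re.compile(r"(?P<route>[A-Z]{1,3}\s*\d+)\s+(?P<dist>[\d.]+)\s*"
               r"(?P<unit>km|mi|miles?)\s+" + DIR +
               r"\s+of\s+(?P<ref>.+)", re.I),
]

def parse_route_ref(text: str) -> Optional[RouteRef]:
    for pat in PATTERNS:
        m = pat.search(text)
        if m:
            dist = float(m.group("dist"))
            unit = (m.groupdict().get("unit") or "mi").lower()
            km = dist if unit.startswith("k") else dist * MILE_KM
            return RouteRef(re.sub(r"\s+", " ", m.group("route")).upper(),
                            round(km, 3), m.group("dir")[0].upper(),
                            m.group("ref").strip())
    return None

print(parse_route_ref("Mile 7.5 on FM 1325 S of Anytown"))
print(parse_route_ref("PR 155 13.5km west of Anytown"))
```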

And then, one would want to leverage routing software to measure the distance along the given route from the given POI, toward providing a roughly-geocoded locational value to get in the ballpark.  And again, one would want a web service that does this to return any appropriate metadata on source, error and so on.
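
As a sketch of that linear-referencing step, Shapely's measure-along-a-line operations can approximate it once the route geometry is in hand. The coordinates below are made-up planar values; a real implementation would pull the centerline from a roads dataset, work in a projected coordinate system, and resolve which direction along the route to walk:

```python
from shapely.geometry import LineString, Point

# Made-up planar coordinates standing in for an FM 1325 centerline
route = LineString([(0, 0), (0, -5000), (2000, -12000), (2000, -20000)])
anytown = Point(100, 0)   # reference POI near the start of the route
MILE_M = 1609.344

start = route.project(anytown)                    # chainage of POI on route
target = route.interpolate(start + 7.5 * MILE_M)  # walk 7.5 miles along
print(f"Approximate location: ({target.x:.0f}, {target.y:.0f})")
# A service wrapping this should also return the record-level metadata the
# post calls for: source dataset, derivation method and accuracy estimate.
```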

PLSS locations, mileage-along-a-route locations, and things like this are just a sampling of the universe of possibilities.  And as I point out, there are bits and pieces of tools that can do some of these things, but they are currently scattered and uncoordinated, and community-oriented, collaborative efforts can help to pull some of these together.

Atop these, as mentioned above, one could also provide additional pieces, such as tools for visual verification at the most basic level, or, if collection mandates permit, tools to allow the user to drop a pushpin on an aerial photo feature, drag a bounding box, or digitize a rough boundary (and ultimately, of course, a means of entering and/or uploading survey data for field-located monumentation points, boundary topology, and record description data).

From a federal perspective, EPA is certainly not the only agency that needs some of these types of tools, nor the only agency that needs internal policy and/or best practices guidance on how these types of values are best represented in databases.  It would make sense, from an Enterprise Architecture perspective, for the federal community to collaborate, along with state, tribal and local governments.  Similarly, I would think that there are a lot of non-profits, academic and private sector entities with a big stake in locational data improvement that could benefit from the improved data such tools would facilitate, along with benefiting from such tools for their own data collection.

For my part, I will try to do what I can toward leading the charge on these, and to leverage any existing efforts already out there.  Additionally, given the capabilities that FRS has, I am looking to continue to integrate across internal stakeholders as well as external agencies toward being able to aggregate, link and reshare, with a process where data is iteratively improved and upgraded collectively.

I'd certainly be interested in getting thoughts, ideas and perspectives from others on this.
