Category | Public Safety |
Agency | City of Portland |
Sub-Agency | Police |
Date Released | March 5, 2010 |
Date Updated | March 5, 2010 |
Frequency | Daily |
Description |
This file reflects crimes reported to the City of Portland Police Bureau. Classification of the crime type is based on the Uniform Crime Reporting (UCR) system developed by the FBI and used by law enforcement agencies throughout the United States. Only the last 12 months of data will be available from the given date of download. |
Agency Program | City of Portland / Police |
Technical Documentation | Crime Incidents (metadata) |
Keywords |
|
Discussion
Great data! It would be nice to have some additional breakdown of the data, like:
1 - Unique case number so developers can identify each crime and update existing data when updates are made, rather than try to match cases across all other data fields
2 - X and Y GIS coordinates converted to two additional Latitude and Longitude coordinate fields
3 - Address field split into Address, City, State, and Zip fields
4 - UCR code number for the crime types in addition to the text descriptions
I second the need for a unique ID or case #. Right now duplicates are very hard to determine. I attempted to generate unique IDs for the crimes in this dataset and I came up with 1000+ duplicates. The SHA was generated by combining the date, major offense, address and x and y coordinates. Those combined should be very unique. My concern was that the so-called duplicates might not actually be duplicates. For instance, what if two people were charged with the same crime? I assume it's very common for more than one person to be involved in a crime. Does it generate two entries in this dataset if two people are involved in committing a single crime? My guess is that it doesn't (two people involved in one homicide is still one homicide). It's hard to tell without a unique ID. To expand on the point further, if you're automating the loading of this dataset on a daily basis, it's very important that you can accurately tell which crimes you've already imported; otherwise you end up with very inaccurate data, which isn't very useful to the developer or the public at large using an application built on this data. Finally, there is a misspelling in the dataset. Y Coordiante should actually be Y Coordinate. I contacted the email address listed in the metadata about a week ago, but haven't received a response. Thanks for releasing this data. It's a great first step!
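A minimal sketch of the fingerprinting approach described in the comment above: hash the date, offense, address, and coordinates of each row, then count fingerprints that occur more than once. The column names (other than "Y Coordinate") and the sample rows are assumptions for illustration, and SHA-1 is chosen arbitrarily because the comment does not say which SHA variant was used.

```python
import csv
import hashlib
import io
from collections import Counter

# Column names are assumptions for illustration; adjust them to the real CSV header.
KEY_FIELDS = ["Report Date", "Major Offense Type", "Address",
              "X Coordinate", "Y Coordinate"]


def row_fingerprint(row, fields=KEY_FIELDS):
    """Return a SHA-1 hex digest built from the chosen fields of one row."""
    # Normalize case and whitespace; the unit separator keeps field joins unambiguous.
    parts = [(row.get(f) or "").strip().upper() for f in fields]
    return hashlib.sha1("\x1f".join(parts).encode("utf-8")).hexdigest()


def find_collisions(rows):
    """Return the fingerprints that occur more than once, with their counts."""
    counts = Counter(row_fingerprint(r) for r in rows)
    return {digest: n for digest, n in counts.items() if n > 1}


# Made-up sample rows, not real incidents.
SAMPLE = """Report Date,Major Offense Type,Address,X Coordinate,Y Coordinate
03/01/2010,Larceny,100 SW EXAMPLE ST,7640000,680000
03/01/2010,Larceny,100 SW EXAMPLE ST,7640000,680000
03/02/2010,Burglary,200 NE SAMPLE AVE,7650000,690000
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
collisions = find_collisions(rows)
print(len(collisions), "fingerprint(s) shared by more than one row")
```

A shared fingerprint only shows that two rows agree on those fields. As the comment points out, it cannot tell a true duplicate from two separate incidents at the same place and time.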
Thank you both for the feedback. The spelling error in the column header has been corrected. What was "Y Coordiante" is now "Y Coordinate". Here are the potential impacts:
* For anyone who has already built a working DTS\SSIS package, you will have to update the package as the column name no longer matches.
* Any programming reference to the previous column name "Y Coordiante" in your application will have to change as well.
With regard to the need for unique IDs, we will make that change in the coming days, and will announce when that is in effect as well.
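One way an import script could absorb a header rename like this is to look the column up by a list of accepted spellings instead of a single hard-coded name. This is only a sketch: the helper names are hypothetical, and "X Coordinate" is an assumed header used to make the sample files realistic.

```python
import csv
import io

# Both spellings of the header mentioned in this thread, newest first.
Y_HEADER_CANDIDATES = ("Y Coordinate", "Y Coordiante")


def resolve_column(fieldnames, candidates):
    """Return the first candidate present in the header, or raise a clear error."""
    for name in candidates:
        if name in fieldnames:
            return name
    raise KeyError(f"none of {candidates} found in header {fieldnames}")


def read_y_values(text):
    """Read the Y coordinate column from CSV text, whichever spelling it uses."""
    reader = csv.DictReader(io.StringIO(text))
    y_col = resolve_column(reader.fieldnames, Y_HEADER_CANDIDATES)
    return [row[y_col] for row in reader]


# Made-up one-row files, one with the old header and one with the corrected header.
old_file = "X Coordinate,Y Coordiante\n7640000,680000\n"
new_file = "X Coordinate,Y Coordinate\n7640000,680000\n"
print(read_y_values(old_file), read_y_values(new_file))
```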
@rnixon: Thanks for the quick response! It's very encouraging to know that you are listening to feedback and making the suggested improvements. With that in mind, it makes it very easy for me to get back to work on my project. Kudos to everyone involved in releasing this data! @yourmapper: You can easily convert the State Plane coordinates in this file to WGS 84 lat/lon coordinates using a project like GDAL. Specifically you're looking for gdaltransform - http://www.gdal.org/gdaltransform.html. Proj4js is also capable of making these conversions - http://proj4js.org.
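A rough sketch of driving gdaltransform from a script. The source coordinate system is an assumption: EPSG:2913 (NAD83(HARN) / Oregon North, international feet) is used here because a later comment mentions HARN state plane coordinates, but the thread never confirms the exact system, so check the dataset metadata first. The function names are hypothetical, and `transform` needs GDAL installed and on the PATH.

```python
import subprocess

# Assumption: EPSG:2913 is the source system; the thread does not confirm this.
SOURCE_SRS = "EPSG:2913"
TARGET_SRS = "EPSG:4326"  # WGS 84


def build_command(source_srs=SOURCE_SRS, target_srs=TARGET_SRS):
    """Build the gdaltransform argument list for a reprojection."""
    return ["gdaltransform", "-s_srs", source_srs, "-t_srs", target_srs]


def format_points(points):
    """gdaltransform reads one 'x y' pair per line from standard input."""
    return "".join(f"{x} {y}\n" for x, y in points)


def transform(points):
    """Run gdaltransform and return its raw output lines, one per input point."""
    result = subprocess.run(
        build_command(),
        input=format_points(points),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Made-up coordinate pair, shown only to illustrate the input format.
print(build_command())
print(format_points([(7640000, 680000)]), end="")
```

The output is returned unparsed on purpose. Check the axis order your GDAL version prints before treating the first value as longitude.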
We've now added a new field “Record ID” at the beginning. It’s an 8-digit unique record identifier for each offense/incident. It’s consistent in all the files out there for the same offense/incident. The changes have been applied to all files and metadata documentation.
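With a stable Record ID, a daily import can update existing rows instead of guessing at duplicates. The sketch below uses SQLite's `INSERT OR REPLACE` keyed on the Record ID. The table layout, column names, and sample rows are hypothetical. Only the existence of an 8-digit "Record ID" comes from the announcement above.

```python
import sqlite3

# Table and column names are hypothetical; only "Record ID" comes from the dataset.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (record_id TEXT PRIMARY KEY, offense TEXT, address TEXT)"
)


def load_batch(conn, rows):
    """Insert new records and overwrite existing ones that share a Record ID."""
    conn.executemany(
        "INSERT OR REPLACE INTO incidents (record_id, offense, address) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


# Made-up rows: the second download repeats 10000001 with a corrected address.
load_batch(conn, [("10000001", "Larceny", "100 SW EXAMPLE ST"),
                  ("10000002", "Burglary", "200 NE SAMPLE AVE")])
load_batch(conn, [("10000001", "Larceny", "150 SW EXAMPLE ST"),
                  ("10000003", "Assault", "300 SE DEMO BLVD")])

total = conn.execute("SELECT COUNT(*) FROM incidents").fetchone()[0]
print(total, "incidents stored after two loads")
```

Storing the ID as text keeps any leading zeros intact, which matters if the 8-digit identifiers can start with 0.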
I'm getting deeper into this data over at http://portlandcrime.com and I'd like to map neighborhood boundaries and gather statistics for crimes in different neighborhoods. There is another dataset[1] that has neighborhood boundaries, which is great. The problem is that the naming doesn't match up, and there is no unique ID assigned to a neighborhood that would let you match the correct neighborhood regardless of how it was spelled. A few examples of how the datasets compare (Crime Incidents left, Neighborhoods right):
POWELHST-GILBRT vs. POWELLHURST-GILBERT
CHINA/OLD TOWN vs. OLD TOWN/CHINATOWN
BEAUMONT-WILSHR vs. BEAUMONT-WILSHIRE
BRENTWD-DARLNGT vs. BRENTWOOD-DARLINGTON
I propose assigning neighborhoods a unique ID and including the identical unique ID in both the Crime Incidents dataset and the Neighborhoods dataset. This would allow you to accurately map crimes to neighborhoods regardless of the spelling differences between the datasets. Another small request would be to stop putting everything in uppercase if it's possible. It's very easy to get MCDONALDS from McDonalds, but it's not easy to do the reverse without your program being aware of the rules of the English language. Thanks a lot!
1: http://www.portlandonline.com/cgis/metadata/viewer/display.cfm?Meta_layer_id=52195&Db_type=sde&City_Only=False
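Until both datasets share a neighborhood ID, a heuristic crosswalk is one possible stopgap. The sketch below treats an abbreviated name as a match when every one of its tokens is a letter subsequence of some token in the full name, ignoring token order. It is an illustration built only on the four pairs quoted above, not a method either dataset endorses. With the full neighborhood list it may produce ambiguous or missing matches, which the function reports as `None`.

```python
import re

# The four full names quoted in the comment above; a real run would load every
# name from the Neighborhoods dataset.
FULL_NAMES = [
    "POWELLHURST-GILBERT",
    "OLD TOWN/CHINATOWN",
    "BEAUMONT-WILSHIRE",
    "BRENTWOOD-DARLINGTON",
]


def tokens(name):
    """Split a neighborhood name on spaces, slashes, and hyphens."""
    return [t for t in re.split(r"[\s/\-]+", name.upper()) if t]


def is_subsequence(short, long):
    """True if the letters of `short` appear in `long` in the same order."""
    remaining = iter(long)
    return all(ch in remaining for ch in short)


def match_neighborhood(abbreviated, candidates=FULL_NAMES):
    """Return the single candidate covering every abbreviated token, else None."""
    hits = [
        full for full in candidates
        if all(any(is_subsequence(a, f) for f in tokens(full))
               for a in tokens(abbreviated))
    ]
    return hits[0] if len(hits) == 1 else None


for short in ["POWELHST-GILBRT", "CHINA/OLD TOWN", "BEAUMONT-WILSHR", "BRENTWD-DARLNGT"]:
    print(short, "->", match_neighborhood(short))
```

Any name that comes back as `None` would need a hand-maintained lookup entry, which is the case a shared unique ID would remove entirely.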
It is great to see that CivicApps has created a separate section for crime incidents. The database is impressive in its level of detail. The one thing I didn't see is tax-related incidents and crimes, which seem to be growing day by day compared with physical crime incidents. So I think CivicApps should consider covering tax-related incidents and opening a discussion about them.
Any idea when you can provide Lat Lon in addition to the state plane coordinates? By the way, the state plane zone is 10, but what's the coordinate system, NAD83?
I've made a (public domain) version of the current dataset where I have converted the HARN 83 state plane coordinates to WGS84 Latitude/Longitude. Feel free to download and use: http://www.opengeocode.org/cude/civicapps/civicapps.crime_incident_data.csv
I've one-upped you - combined your file and converted ALL 2005-2011 crime reports to longitude/latitude. The file is here https://skydrive.live.com/redir?resid=D76BB7E3B1B7C6F4%211280 (WARNING: Massive zip file, 132 MB) This file contains 2005 to December 2013 crime incidents in Portland. Unfortunately I had to cut out incidents that could not be mapped (i.e., the ones where no X, Y coordinates existed); however, one could combine my data set with an old dataset to have the complete picture.
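A small sketch of how rows without coordinates could be set aside rather than silently dropped, so the number of unmappable incidents stays visible. The "X Coordinate" header and the sample rows are assumptions for illustration. This is not the code used to produce the file above.

```python
import csv
import io


def split_by_coordinates(rows, x_field="X Coordinate", y_field="Y Coordinate"):
    """Separate rows that have both coordinates from rows missing either one."""
    mappable, unmappable = [], []
    for row in rows:
        x = (row.get(x_field) or "").strip()
        y = (row.get(y_field) or "").strip()
        (mappable if x and y else unmappable).append(row)
    return mappable, unmappable


# Made-up sample rows; the second and third lack usable coordinates.
SAMPLE = """Record ID,X Coordinate,Y Coordinate
10000001,7640000,680000
10000002,,
10000003,7650000,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
mappable, unmappable = split_by_coordinates(rows)
print(len(mappable), "mappable,", len(unmappable), "without coordinates")
```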
I've one-upped your one-up. I downloaded the NOAA daily weather data for Portland for 2013 and added to the end of each record in the file the weather for that day - min/max temperature (Celsius), average wind speed (km/hr), precipitation (mm) and snowfall (mm). All readings are from the Portland International Airport weather station. Feel free to download and use: http://www.opengeocode.org/cude/civicapps/civicapps.crime_incident_data.csv
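Appending one day's weather to each incident amounts to a lookup keyed on the date. The sketch below shows that join under stated assumptions: the field names, the date format, and the sample values are all hypothetical, and it is not the code used to build the file above.

```python
# Hypothetical names for the five weather readings described above.
WEATHER_FIELDS = ["tmin_c", "tmax_c", "wind_kmh", "precip_mm", "snow_mm"]


def attach_weather(incidents, weather_by_date, date_field="Report Date"):
    """Return copies of the incident rows with that day's weather appended."""
    joined = []
    for row in incidents:
        weather = weather_by_date.get(row[date_field], {})
        # Days with no weather record get empty strings instead of being dropped.
        extra = {name: weather.get(name, "") for name in WEATHER_FIELDS}
        joined.append({**row, **extra})
    return joined


# Made-up rows and readings for illustration only.
incidents = [
    {"Record ID": "10000001", "Report Date": "01/15/2013"},
    {"Record ID": "10000002", "Report Date": "01/16/2013"},
]
weather_by_date = {
    "01/15/2013": {"tmin_c": "1.1", "tmax_c": "7.2", "wind_kmh": "9.0",
                   "precip_mm": "0.0", "snow_mm": "0.0"},
}

joined = attach_weather(incidents, weather_by_date)
print(joined[0]["tmax_c"], repr(joined[1]["tmax_c"]))
```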
Thank you for your work converting the coordinates. I noticed that the converted dataset is only 19,275 rows (incidents) while the original was 60,245. Why is this? Thanks.
Request: I'd love to see the Incident Description included as a column. Would help to establish themes or analyze trends (for example, vandalism trends). Question: I found a report that I filed for an incident involving Theft from a Motor Vehicle, but a) it was listed as Larceny and not Motor Vehicle Theft, and b) the 8-digit unique ID did not match the 8-digit ID from my report. Not sure where those data pull from, so I'm not sure why they are inaccurate.
Thanks for your questions.
1) Theft from a Motor Vehicle – Law enforcement agencies in the U.S. use a classification system developed a number of years ago by the FBI. Theft from a motor vehicle is classified as a larceny: an object or car part is taken from the vehicle. By contrast, a motor vehicle theft is when the vehicle itself is stolen.
2) Incident numbers – These numbers are unique ID numbers generated for the CivicApps application. They are not the case numbers on the police reports. Because the dataset can involve such crimes as domestic violence, assaults, harassment, etc., the decision was made not to use the actual case numbers.
When will the dataset be updated? Right now it only goes to April 2015.
The dataset is updated once a year, usually in the spring, for the prior year statistics. For example, Crime Incidents for 2015 should be out in the spring of 2016. Thanks.
The "Police Incidents Last 6 hours" data map has not been working for several weeks. I can't seem to find the link on this site where questions can be asked about it. Our neighborhood learns of gunshots and other issues only a day or two after the incidents. We rely on the incidents site page to tell us whether we are in danger.
The feed has been down for weeks and all you hear are crickets.
***Correction regarding the release of the 2015 crime data: The Police Bureau moved to a new records management system in 2015. As part of this change, incident-based reporting (NIBRS) was adopted in place of the old system of UCR offense coding. Due to these changes, there have been delays in the record review process, in which the offense codes used for statistical reporting are added to reports and the other variables included in this dataset are verified. The 2015 data will be available once all records have been reviewed.