Geo-localized patents by French residents registered at the USPTO, 1838-1960

(with Sergio Petralia, Ernest Miguelez and Rosina Moreno)

https://doi.org/10.7910/DVN/JYMTD6, Harvard Dataverse 

Description: This dataset contains all patents granted by the USPTO to French residents over the period 1838-1960. Each patent is geo-localized at the city (commune) level, and the corresponding NUTS 3 region (département) is also included.

Technical Details: For each patent, the algorithm identifies the name of the commune where the patent is registered. This location is based on the address of either the inventor or the assignee; the patent owner can be either the "Inventor" or the "Assignee," with the latter typically being a company.
In many cases, however, the commune's name alone is not enough to determine the patent's location, because several communes may share the same or similar names. The department name is then needed for precise identification. For patents registered in communes affected by this issue, we manually extract the department name from the patent documents. If the inventor's location still cannot be determined, because several communes share the name and no departmental information is available, we consider the proximity of these communes to the assignee's location, if an assignee is listed, and report this as the location. If no assignee information is available either, we search for the inventor's name in the Google Patents database, which may allow us to identify the inventor's location from another patent document filed in the same or a nearby year. If no additional information is available to determine the inventor's location, we default to the commune name returned by the geo-coding process.
As a robustness check, we manually review the names of communes that appear only once or twice in the database, as these are more likely to be erroneous. We found that, in some instances, the algorithm mistakenly reports the inventor's surname as the name of a commune, particularly when the surname is "Franco," "France," or "François." Often these patents are not from France at all, so we exclude them from the analysis.
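The fallback order described above can be illustrated with a minimal Python sketch. It is not the code used to build the dataset: the `Commune` class, the `resolve_location` function, its parameters, and the rule labels are all hypothetical names introduced here. The sketch also assumes one particular reading of the proximity step, namely that the candidate commune closest to the assignee (by great-circle distance) is chosen.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Commune:
    # Hypothetical record: name, department, and coordinates in degrees.
    name: str
    department: str
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def resolve_location(
    candidates: List[Commune],
    department: Optional[str] = None,               # read manually from the patent document
    assignee: Optional[Commune] = None,             # assignee location, if one is listed
    other_patent_department: Optional[str] = None,  # found via another patent by the same inventor
    geocoder_default: Optional[Commune] = None,     # what the geo-coding process returned
) -> Tuple[Optional[Commune], str]:
    """Return the chosen commune and a label for the rule that chose it."""
    if not candidates:
        return geocoder_default, "geocoder_default"
    if len(candidates) == 1:
        return candidates[0], "unique_name"
    # Step 1: department name extracted from the patent document.
    if department is not None:
        matches = [c for c in candidates if c.department == department]
        if len(matches) == 1:
            return matches[0], "department"
    # Step 2: proximity to the assignee (assumed: nearest candidate wins).
    if assignee is not None:
        nearest = min(
            candidates,
            key=lambda c: haversine_km(c.lat, c.lon, assignee.lat, assignee.lon),
        )
        return nearest, "assignee_proximity"
    # Step 3: department found on another patent by the same inventor.
    if other_patent_department is not None:
        matches = [c for c in candidates if c.department == other_patent_department]
        if len(matches) == 1:
            return matches[0], "other_patent"
    # Step 4: fall back to the geo-coder's own answer.
    return geocoder_default, "geocoder_default"


# Illustrative data only: two same-named communes with made-up coordinates.
a = Commune("Villeneuve", "Dept A", 48.0, 2.0)
b = Commune("Villeneuve", "Dept B", 44.0, 1.0)
firm = Commune("Firmville", "Dept A", 47.9, 2.1)
print(resolve_location([a, b], assignee=firm)[1])
```

Returning the rule label alongside the commune makes it possible to check afterwards how many patents were located by each step.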
The dataset covers only patents located in Metropolitan France (France métropolitaine), also referred to here as Continental France: the part of France on the European continent, comprising the mainland and nearby islands such as Corsica. It excludes France's overseas departments and territories, which lie outside Europe, such as those in the Caribbean, the Indian Ocean, the Pacific Ocean, and South America.
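The manual review of rare commune names and the restriction to Metropolitan France could be prepared along the following lines. This is a sketch under stated assumptions rather than the procedure actually used: the record layout (dictionaries with `id`, `commune`, and `dept_code` keys), the function names, and the sample records are hypothetical, and the territorial filter assumes present-day INSEE department codes, in which overseas departments and collectivities begin with "97" or "98".

```python
from collections import Counter

# Surnames the text reports as being mistaken for commune names.
SURNAME_LOOKALIKES = {"franco", "france", "françois"}


def flag_for_review(records, max_count=2):
    """Return records whose commune is rare or looks like a known surname."""
    counts = Counter(r["commune"] for r in records)
    flagged = []
    for r in records:
        rare = counts[r["commune"]] <= max_count
        lookalike = r["commune"].casefold() in SURNAME_LOOKALIKES
        if rare or lookalike:
            flagged.append(r)
    return flagged


def is_metropolitan(dept_code):
    """Assumption: present-day INSEE codes; overseas codes start with 97 or 98."""
    return not dept_code.startswith(("97", "98"))


# Hypothetical sample records; the patent ids are made up.
records = [
    {"id": "p1", "commune": "Lyon", "dept_code": "69"},
    {"id": "p2", "commune": "Lyon", "dept_code": "69"},
    {"id": "p3", "commune": "Lyon", "dept_code": "69"},
    {"id": "p4", "commune": "France", "dept_code": "69"},
    {"id": "p5", "commune": "Xville", "dept_code": "971"},
]

to_review = flag_for_review(records)
kept = [r for r in records if is_metropolitan(r["dept_code"])]
print([r["id"] for r in to_review], [r["id"] for r in kept])
```

The flagged records are only candidates for manual inspection; deciding whether a patent is actually from France still requires reading the patent document.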