
Harvest .gdb, .shp or .mdb as a GIS mapping file #232

Open · tejesri opened this issue Oct 10, 2024 · 2 comments

@tejesri commented Oct 10, 2024

Hi Team,

Geoportal Server can harvest data from a UNC path. However, when we harvest .gdb, .mdb, and .shp files, Geoportal Server only harvests the file system entries; we want to harvest these datasets as GIS layers rather than just as files. I’ve attached a snapshot below for your reference.
[screenshot: geoportal harvest results]

Thanks & Regards,
Tejpal

@mhogeweg (Member) commented Oct 10, 2024

Hi, that is expected behavior. To harvest metadata from file or personal geodatabases, you will need to write an ArcPy script that crawls the folders. Something like this:

Build the list of folders (excluding file geodatabase folders, to avoid what you experienced above):

    import os

    # skip .gdb folders so the geodatabase internals are not
    # crawled as plain file system folders
    workspaces = [x[0] for x in os.walk(start_dir) if not x[0].endswith('.gdb')]

    # crawl each of the folders as a workspace
    for workspace in workspaces:
        parse_workspace(workspace)

parse_workspace would be a function that does what you want with the datasets in the workspace. For example, loop over all datasets in the workspace, producing metadata for each dataset:

    import arcpy

    def parse_workspace(workspace):
        # point arcpy at the workspace, list its ArcGIS-compatible
        # datasets, then create metadata for each dataset and
        # publish it to the geoportal
        arcpy.env.workspace = workspace
        datasets = arcpy.ListFeatureClasses()
        for dataset in datasets:
            metadata = generate_metadata(workspace, dataset)
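
generate_metadata is left to you. A minimal sketch, assuming your geoportal accepts a simple JSON item with title and description fields (those field names are my assumption; adjust them to your instance's schema), could use arcpy.Describe:

    import os
    import arcpy

    def generate_metadata(workspace, dataset):
        # hypothetical sketch: build a minimal JSON item from the
        # dataset's Describe properties; the keys below are assumed,
        # not an official Geoportal schema
        desc = arcpy.Describe(os.path.join(workspace, dataset))
        return {
            'title': desc.name,
            'description': f'{desc.dataType} harvested from {workspace}',
        }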

Then you can publish to Geoportal Server through its REST API:

import json
import requests
from requests.auth import HTTPBasicAuth

...

server = 'http://localhost:8080/geoportal/rest/metadata/item/'
auth = HTTPBasicAuth(username, password)
headers = {'Content-Type': 'application/json'}

def publish_metadata(metadata, item_id):
    the_url = server + item_id
    print(f"the_url - {the_url}")
    result = requests.put(url=the_url, data=json.dumps(metadata), auth=auth, headers=headers)
    print(f"{item_id} - {result.text}")

Here, the_url is the URL to your geoportal's REST endpoint, auth carries the username/password for your geoportal instance, and headers sets the content type. Effectively, Python does an HTTP PUT similar to the curl request below:

curl -X 'PUT' \
  'http://localhost:8080/geoportal/rest/metadata/item/abc123' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/xml' \
  -d '<?xml version="1.0" encoding="UTF-8"?>
<metadata>goes here</metadata>'
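
Putting the pieces together, a minimal sketch of the wiring: inside the dataset loop, derive an item id and call publish_metadata. The id scheme below (dataset name without its extension) is just an illustration, not something the API requires:

    # hypothetical wiring of the snippets above; the item id is
    # derived from the dataset name purely for illustration
    for dataset in datasets:
        metadata = generate_metadata(workspace, dataset)
        item_id = os.path.splitext(dataset)[0]
        publish_metadata(metadata, item_id)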

@tejesri (Author) commented Oct 24, 2024

Hi Marten,

I tried to crawl/harvest the .gdb and .shp files using the Python script, but I am getting the error below:

"Workspaces to process: ['D:\shapefile']
Processing workspace: D:\shapefile
Feature classes found in D:\shapefile: ['lewis.shp']
Rasters found in D:\shapefile: []
Publishing to http://presalesgeoportal.esritech.in:8080/geoportal/rest/metadata/item/lewis.shp
HTTP error occurred for lewis.shp: 400 Client Error: for url: http://presalesgeoportal.esritech.in:8080/geoportal/rest/metadata/item/lewis.shp"

I have attached the Python script (geoportalserver.py) here for your reference.

Could you please check it and help me harvest the .gdb and .shp files?

Thanks & Regards,
Tejpal

Attachment: geoportal_server.zip
