Data Importer Package

Submodules

Managers

class seed.data_importer.managers.NotDeletedManager

Bases: django.db.models.manager.Manager

get_queryset(*args, **kwargs)

Models

class seed.data_importer.models.BuildingImportRecord(id, import_record, building_model_content_type, building_pk, was_in_database, is_missing_from_import)

Bases: django.db.models.base.Model

exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

exception BuildingImportRecord.MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

BuildingImportRecord.building_model_content_type

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

BuildingImportRecord.building_record

Provide a generic many-to-one relation through the content_type and object_id fields.

This class also doubles as an accessor to the related object (similar to ForwardManyToOneDescriptor) by adding itself as a model attribute.

BuildingImportRecord.import_record

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

BuildingImportRecord.objects = <django.db.models.manager.Manager object>
class seed.data_importer.models.DataCoercionMapping(id, table_column_mapping, source_string, source_type, destination_value, destination_type, is_mapped, confidence, was_a_human_decision, valid_destination_value, active)

Bases: django.db.models.base.Model

exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

exception DataCoercionMapping.MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

DataCoercionMapping.objects = <django.db.models.manager.Manager object>
DataCoercionMapping.save(*args, **kwargs)
DataCoercionMapping.source_string_sha
DataCoercionMapping.table_column_mapping

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

class seed.data_importer.models.ImportFile(id, created, modified, deleted, import_record, file, export_file, file_size_in_bytes, cached_first_row, cached_second_to_fifth_row, num_columns, num_rows, num_mapping_warnings, num_mapping_errors, mapping_error_messages, num_validation_errors, num_tasks_total, num_tasks_complete, num_coercion_errors, num_coercions_total, has_header_row, raw_save_done, raw_save_completion, mapping_done, mapping_completion, matching_done, matching_completion, source_type, source_program, source_program_version)

Bases: seed.data_importer.models.NotDeletableModel, django_extensions.db.models.TimeStampedModel

CLEANING_ACTIVE_CACHE_KEY
classmethod CLEANING_ACTIVE_CACHE_KEY_GENERATOR(pk)
CLEANING_PROGRESS_KEY
CLEANING_QUEUED_CACHE_KEY
classmethod CLEANING_QUEUED_CACHE_KEY_GENERATOR(pk)
exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

ImportFile.EXPORT_PCT_COMPLETE_CACHE_KEY
ImportFile.EXPORT_QUEUED_CACHE_KEY
ImportFile.EXPORT_READY_CACHE_KEY
exception ImportFile.MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

ImportFile.QUEUED_TCM_DATA_KEY
ImportFile.QUEUED_TCM_SAVE_COUNTER_KEY
ImportFile.SAVE_COUNTER_CACHE_KEY
ImportFile.UPDATING_TCMS_KEY
ImportFile.buildingsnapshot_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_reverse_many_to_one_manager() in django.db.models.fields.related_descriptors.

ImportFile.cache_first_rows()
ImportFile.cleaned_data_rows

Iterable of rows, each an iterable of the column values of the cleaned data.

ImportFile.cleaning_progress_pct
ImportFile.coercion_mapping_active
ImportFile.coercion_mapping_queued
ImportFile.data_rows

Iterable of rows, each an iterable of the column values of the raw data.

ImportFile.default_manager = <seed.data_importer.managers.NotDeletedManager object>
ImportFile.export_generation_pct_complete
ImportFile.export_ready
ImportFile.export_url
ImportFile.filename_only
ImportFile.first_row_columns
ImportFile.force_restart_cleaning_url
ImportFile.from_portfolio_manager
ImportFile.generate_url
ImportFile.get_next_by_created(*moreargs, **morekwargs)
ImportFile.get_next_by_modified(*moreargs, **morekwargs)
ImportFile.get_previous_by_created(*moreargs, **morekwargs)
ImportFile.get_previous_by_modified(*moreargs, **morekwargs)
ImportFile.import_record

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

ImportFile.local_file
ImportFile.merge_progress_url
ImportFile.num_cells
ImportFile.num_cleaning_complete
ImportFile.num_cleaning_remaining
ImportFile.num_cleaning_total
ImportFile.num_failed_tablecolumnmappings
ImportFile.num_mapping_complete
ImportFile.num_mapping_remaining
ImportFile.num_mapping_total
ImportFile.objects = <seed.data_importer.managers.NotDeletedManager object>
ImportFile.premerge_progress_url
ImportFile.raw_objects = <django.db.models.manager.Manager object>
ImportFile.ready_to_import
ImportFile.save(in_validation=False, *args, **kwargs)
ImportFile.second_to_fifth_rows
ImportFile.tablecolumnmapping_formset(*args, **kwargs)
ImportFile.tablecolumnmapping_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_reverse_many_to_one_manager() in django.db.models.fields.related_descriptors.

ImportFile.tablecolumnmappings
ImportFile.tablecolumnmappings_failed
ImportFile.tcm_errors_json
ImportFile.tcm_fields_to_save
ImportFile.tcm_json
ImportFile.update_tcms_from_save(json_data, save_counter)
class seed.data_importer.models.ImportRecord(id, deleted, name, app, owner, start_time, finish_time, created_at, updated_at, last_modified_by, notes, merge_analysis_done, merge_analysis_active, merge_analysis_queued, premerge_analysis_done, premerge_analysis_active, premerge_analysis_queued, matching_active, matching_done, is_imported_live, keep_missing_buildings, status, import_completed_at, merge_completed_at, mcm_version, super_organization)

Bases: seed.data_importer.models.NotDeletableModel

exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

ImportRecord.MAPPING_ACTIVE_KEY
ImportRecord.MAPPING_QUEUED_KEY
exception ImportRecord.MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

classmethod ImportRecord.SUMMARY_ANALYSIS_ACTIVE_KEY(pk)
classmethod ImportRecord.SUMMARY_ANALYSIS_QUEUED_KEY(pk)
ImportRecord.add_files_url
ImportRecord.app_namespace
ImportRecord.buildingimportrecord_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_reverse_many_to_one_manager() in django.db.models.fields.related_descriptors.

ImportRecord.dashboard_url
ImportRecord.default_manager = <seed.data_importer.managers.NotDeletedManager object>
ImportRecord.delete(*args, **kwargs)
ImportRecord.delete_url
ImportRecord.display_as_in_progress
ImportRecord.estimated_seconds_remaining
ImportRecord.files
ImportRecord.form
ImportRecord.get_status_display(*moreargs, **morekwargs)
ImportRecord.importfile_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_reverse_many_to_one_manager() in django.db.models.fields.related_descriptors.

ImportRecord.is_mapping_or_cleaning
ImportRecord.is_not_in_progress
ImportRecord.last_modified_by

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

ImportRecord.mark_merge_started()

Marks the ImportRecord as having a merge in progress.

ImportRecord.mark_merged()

Marks the ImportRecord as having been processed (via merge_import_record()).

ImportRecord.match_progress_key

Cache key used to track percentage completion for the matching task.

ImportRecord.matched_buildings
ImportRecord.merge_progress_key

Cache key used to track percentage completion for merge task.

ImportRecord.merge_progress_url
ImportRecord.merge_seconds_remaining_key
ImportRecord.merge_status
ImportRecord.merge_status_key

Cache key used to set/get status messages for merge task.

ImportRecord.merge_url
ImportRecord.missing_buildings
ImportRecord.new_buildings
ImportRecord.num_buildings_imported_total
ImportRecord.num_coercion_errors
ImportRecord.num_columns
ImportRecord.num_failed_tablecolumnmappings
ImportRecord.num_files
ImportRecord.num_files_cleaned
ImportRecord.num_files_mapped
ImportRecord.num_files_merged
ImportRecord.num_files_to_clean
ImportRecord.num_files_to_map
ImportRecord.num_files_to_merge
ImportRecord.num_matched_buildings
ImportRecord.num_missing_buildings
ImportRecord.num_new_buildings
ImportRecord.num_not_ready_for_import
ImportRecord.num_ready_for_import
ImportRecord.num_rows
ImportRecord.num_validation_errors
ImportRecord.objects = <seed.data_importer.managers.NotDeletedManager object>
ImportRecord.owner

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

ImportRecord.pct_merge_complete
ImportRecord.pct_premerge_complete
ImportRecord.percent_files_cleaned
ImportRecord.percent_files_mapped
ImportRecord.percent_files_ready_to_merge
ImportRecord.percent_ready_for_import
ImportRecord.percent_ready_for_import_by_file_count
ImportRecord.pre_merge_url
ImportRecord.prefixed_pk(pk, max_len_before_prefix=32)

This is a total hack to support prefixing until source_facility_id is turned into a proper pk. Prefixes a given pk with the import_record

ImportRecord.premerge_estimated_seconds_remaining
ImportRecord.premerge_progress_key
ImportRecord.premerge_progress_url
ImportRecord.premerge_seconds_remaining_key
ImportRecord.raw_objects = <django.db.models.manager.Manager object>
ImportRecord.ready_for_import
ImportRecord.save_import_meta_url
ImportRecord.search_url
ImportRecord.start_merge_url
ImportRecord.status_denominator
ImportRecord.status_is_live
ImportRecord.status_numerator
ImportRecord.status_percent
ImportRecord.status_url
ImportRecord.summary_analysis_active
ImportRecord.summary_analysis_queued
ImportRecord.super_organization

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

ImportRecord.to_json
ImportRecord.total_correct_mappings
ImportRecord.total_file_size
ImportRecord.worksheet_progress_json
ImportRecord.worksheet_url
class seed.data_importer.models.NotDeletableModel(*args, **kwargs)

Bases: django.db.models.base.Model

class Meta
abstract = False
NotDeletableModel.delete(*args, **kwargs)
class seed.data_importer.models.RangeValidationRule(id, table_column_mapping, passes, validationrule_ptr, max_value, min_value, limit_min, limit_max)

Bases: seed.data_importer.models.ValidationRule

exception DoesNotExist

Bases: seed.data_importer.models.DoesNotExist

exception RangeValidationRule.MultipleObjectsReturned

Bases: seed.data_importer.models.MultipleObjectsReturned

RangeValidationRule.objects = <django.db.models.manager.Manager object>
RangeValidationRule.validationrule_ptr

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

class seed.data_importer.models.TableColumnMapping(id, app, source_string, import_file, destination_model, destination_field, order, confidence, ignored, was_a_human_decision, error_message_text, active)

Bases: django.db.models.base.Model

exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

exception TableColumnMapping.MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

TableColumnMapping.combined_model_and_field
TableColumnMapping.datacoercion_errors
TableColumnMapping.datacoercionmapping_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_reverse_many_to_one_manager() in django.db.models.fields.related_descriptors.

TableColumnMapping.datacoercions
TableColumnMapping.destination_django_field

commented out by AKL, not needed for SEED and removes dependency on libs.

TableColumnMapping.destination_django_field_choices
TableColumnMapping.destination_django_field_has_choices
TableColumnMapping.fields_to_save = ['pk', 'destination_model', 'destination_field', 'ignored']
TableColumnMapping.first_five_rows
TableColumnMapping.first_row
TableColumnMapping.friendly_destination_field
TableColumnMapping.friendly_destination_model
TableColumnMapping.friendly_destination_model_and_field
TableColumnMapping.import_file

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

TableColumnMapping.is_mapped
TableColumnMapping.objects = <django.db.models.manager.Manager object>
TableColumnMapping.save(*args, **kwargs)
TableColumnMapping.source_string_sha
TableColumnMapping.validation_rules
TableColumnMapping.validationrule_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_reverse_many_to_one_manager() in django.db.models.fields.related_descriptors.

class seed.data_importer.models.ValidationOutlier(id, rule, value)

Bases: django.db.models.base.Model

exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

exception ValidationOutlier.MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

ValidationOutlier.objects = <django.db.models.manager.Manager object>
ValidationOutlier.rule

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

class seed.data_importer.models.ValidationRule(id, table_column_mapping, passes)

Bases: django.db.models.base.Model

exception DoesNotExist

Bases: django.core.exceptions.ObjectDoesNotExist

exception ValidationRule.MultipleObjectsReturned

Bases: django.core.exceptions.MultipleObjectsReturned

ValidationRule.objects = <django.db.models.manager.Manager object>
ValidationRule.rangevalidationrule

Accessor to the related object on the reverse side of a one-to-one relation.

In the example:

class Restaurant(Model):
    place = OneToOneField(Place, related_name='restaurant')

place.restaurant is a ReverseOneToOneDescriptor instance.

ValidationRule.table_column_mapping

Accessor to the related object on the forward side of a many-to-one or one-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

child.parent is a ForwardManyToOneDescriptor instance.

ValidationRule.validationoutlier_set

Accessor to the related objects manager on the reverse side of a many-to-one relation.

In the example:

class Child(Model):
    parent = ForeignKey(Parent, related_name='children')

parent.children is a ReverseManyToOneDescriptor instance.

Most of the implementation is delegated to a dynamically defined manager class built by create_reverse_many_to_one_manager() in django.db.models.fields.related_descriptors.

seed.data_importer.models.queue_update_status_for_import_record(pk)

edited by AKL to trim down data_importer

seed.data_importer.models.update_status_from_dcm(sender, instance, **kwargs)
seed.data_importer.models.update_status_from_import_file(sender, instance, **kwargs)
seed.data_importer.models.update_status_from_import_record(sender, instance, **kwargs)
seed.data_importer.models.update_status_from_tcm(sender, instance, **kwargs)

URLs

Utils

class seed.data_importer.utils.CoercionRobot

Bases: object

lookup_hash(uncoerced_value, destination_model, destination_field)
make_key(value, model, field)
seed.data_importer.utils.acquire_lock(name, expiration=None)

Tries to acquire a lock from the cache. Also sets the lock’s value to the current time, allowing us to see how long it has been held.

Returns False if the lock is already held by another process.

seed.data_importer.utils.chunk_iterable(iter, chunk_size)

Breaks an iterable (e.g. a list) into smaller chunks, returning a generator of the chunks.
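The chunking behaviour described above can be sketched as follows. This is a minimal illustration, not the SEED implementation: the name chunk_iterable_sketch is hypothetical, and the choice to yield each chunk as a list is an assumption, since the reference does not say what type each chunk has.

```python
from itertools import islice


def chunk_iterable_sketch(iterable, chunk_size):
    """Yield successive lists of at most chunk_size items (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice consumes up to chunk_size items from the shared iterator.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return  # the iterator is exhausted
        yield chunk


chunks = list(chunk_iterable_sketch(range(5), 2))
print(chunks)
```

Because the helper is a generator, only one chunk is held in memory at a time, which is the usual reason to chunk a large import.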

seed.data_importer.utils.get_core_pk_column(table_column_mappings, primary_field)
seed.data_importer.utils.get_lock_time(name)

Examines a lock to see when it was acquired.

seed.data_importer.utils.release_lock(name)

Frees a lock.
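The three lock helpers above follow a common cache-based locking pattern, sketched below. Everything here is an assumption made for illustration: InMemoryCache is a hypothetical stand-in for a real cache backend, and the *_sketch functions only mirror the documented behaviour (store the acquisition time, refuse a lock that is already held). A real deployment would need a cache whose add operation is atomic across processes.

```python
import time


class InMemoryCache:
    """Hypothetical single-process stand-in for a cache backend."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at or None)

    def add(self, key, value, timeout=None):
        """Store value only if key is absent or expired; return True on success."""
        now = time.time()
        entry = self._store.get(key)
        if entry is not None and (entry[1] is None or entry[1] > now):
            return False  # key is still live, so the caller does not get it
        expires_at = None if timeout is None else now + timeout
        self._store[key] = (value, expires_at)
        return True

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and expires_at <= time.time():
            del self._store[key]  # drop the expired entry
            return None
        return value

    def delete(self, key):
        self._store.pop(key, None)


cache = InMemoryCache()


def acquire_lock_sketch(name, expiration=None):
    # The stored value is the acquisition time, so it can be inspected later.
    return cache.add(name, time.time(), expiration)


def get_lock_time_sketch(name):
    return cache.get(name)


def release_lock_sketch(name):
    cache.delete(name)


first = acquire_lock_sketch("import-42")
second = acquire_lock_sketch("import-42")  # already held
held_since = get_lock_time_sketch("import-42")
release_lock_sketch("import-42")
third = acquire_lock_sketch("import-42")  # free again after release
```

Storing the timestamp as the lock value is what lets a caller decide whether a long-held lock is stale.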

Views

class seed.data_importer.views.DataImportBackend(**kwargs)

Bases: ajaxuploader.backends.local.LocalUploadBackend

Subclass of ajaxuploader’s LocalUploadBackend, to handle creation of ImportFile objects related to the specified ImportRecord.

upload_complete(request, filename, *args, **kwargs)

Called directly by fineuploader on upload completion.

seed.data_importer.views.get_upload_details(request, *args, **kwargs)

Retrieves details about how to upload files to this instance.

Returns:

If S3 mode:

{
    'upload_mode': 'S3',
    'upload_complete': A URL to notify that the upload is complete,
    'signature': The URL to post file details to for auth to upload to S3.
}

If local file system mode:

{
    'upload_mode': 'filesystem',
    'upload_path': The URL to POST files to (see local_uploader)
}
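
A client typically branches on upload_mode before uploading. The sketch below shows one way to do that with the two response shapes documented above. The function name plan_upload and the example URLs are hypothetical and are not part of SEED.

```python
def plan_upload(details):
    """Return the endpoints a client needs for the reported upload mode.

    `details` is a dict shaped like the get_upload_details response.
    """
    mode = details["upload_mode"]
    if mode == "S3":
        # Sign the policy first, upload to S3, then notify the SEED instance.
        return {
            "sign_url": details["signature"],
            "notify_url": details["upload_complete"],
        }
    if mode == "filesystem":
        # Files are POSTed straight to the instance.
        return {"post_url": details["upload_path"]}
    raise ValueError("unknown upload_mode: %r" % (mode,))


# Placeholder URLs, for illustration only.
s3_plan = plan_upload({
    "upload_mode": "S3",
    "upload_complete": "/example/upload-complete/",
    "signature": "/example/sign/",
})
local_plan = plan_upload({
    "upload_mode": "filesystem",
    "upload_path": "/example/upload/",
})
```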
seed.data_importer.views.handle_s3_upload_complete(request, *args, **kwargs)

Notify the system that an upload to S3 has been completed. This step is required after uploading to S3; otherwise the SEED instance will not be aware that the file exists.

Valid source_type values are found in seed.models.SEED_DATA_SOURCES.

GET:

Expects the following in the query string:

key: The full path to the file, within the S3 bucket.

E.g. data_importer/buildings.csv

source_type: The source of the file.

E.g. ‘Assessed Raw’ or ‘Portfolio Raw’

source_program: Optional value from common.mapper.Programs

source_version: e.g. “4.1”

import_record: The ID of the ImportRecord this file belongs to.

Returns:

{
    'success': True,
    'import_file_id': The ID of the newly-created ImportFile object.
}
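
The query string described above can be assembled with the standard library, as in the sketch below. The helper name build_s3_complete_query is hypothetical, the record ID 42 is a made-up example, and the URL path of the view is not given in this reference, so only the query string is built.

```python
from urllib.parse import parse_qs, urlencode


def build_s3_complete_query(key, source_type, import_record,
                            source_program=None, source_version=None):
    """Encode the documented GET parameters (hypothetical helper)."""
    params = {
        "key": key,                      # full path within the S3 bucket
        "source_type": source_type,      # e.g. 'Assessed Raw'
        "import_record": import_record,  # ID of the owning ImportRecord
    }
    # Optional parameters are only sent when supplied.
    if source_program is not None:
        params["source_program"] = source_program
    if source_version is not None:
        params["source_version"] = source_version
    return urlencode(params)


query = build_s3_complete_query(
    "data_importer/buildings.csv", "Assessed Raw", 42
)
decoded = parse_qs(query)
```

urlencode takes care of escaping the slash in the key and the space in the source type, so the values survive a round trip unchanged.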
seed.data_importer.views.sign_policy_document(request, *args, **kwargs)

Sign and return the policy document for a simple upload. See http://aws.amazon.com/articles/1434/#signyours3postform

Payload:

{
 "expiration": ISO-encoded timestamp for when signature should expire,
               e.g. "2014-07-16T00:20:56.277Z",
 "conditions":
     [
         {"acl":"private"},
         {"bucket": The name of the bucket from get_upload_details},
         {"Content-Type":"text/csv"},
         {"success_action_status":"200"},
         {"key": filename of upload, prefixed with 'data_imports/',
                 suffixed with a unique timestamp.
                 e.g. 'data_imports/my_buildings.csv.1405469756'},
         {"x-amz-meta-category":"data_imports"},
         {"x-amz-meta-qqfilename": original filename}
     ]
}

Returns:

{
    "policy": A hash of the policy document. Used during upload to S3.
    "signature": A signature of the policy document. Also used during upload to S3.
}
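
For orientation, the sketch below signs a policy document the way the legacy AWS browser-based POST scheme (Signature Version 2) does: the policy JSON is Base64-encoded, and the signature is a Base64-encoded HMAC-SHA1 of that encoded policy. Whether this view does exactly this is an assumption. The function name sign_policy_sketch, the bucket name and the secret key are all hypothetical.

```python
import base64
import hashlib
import hmac
import json


def sign_policy_sketch(policy_document, secret_key):
    """Return the Base64 policy and its HMAC-SHA1 signature (illustrative)."""
    policy_json = json.dumps(policy_document).encode("utf-8")
    policy_b64 = base64.b64encode(policy_json)
    # The signature is computed over the encoded policy, not the raw JSON.
    digest = hmac.new(secret_key.encode("utf-8"), policy_b64, hashlib.sha1).digest()
    return {
        "policy": policy_b64.decode("ascii"),
        "signature": base64.b64encode(digest).decode("ascii"),
    }


example_policy = {
    "expiration": "2014-07-16T00:20:56.277Z",
    "conditions": [
        {"acl": "private"},
        {"bucket": "example-bucket"},  # placeholder bucket name
        {"Content-Type": "text/csv"},
        {"success_action_status": "200"},
        {"key": "data_imports/my_buildings.csv.1405469756"},
        {"x-amz-meta-category": "data_imports"},
        {"x-amz-meta-qqfilename": "my_buildings.csv"},
    ],
}
signed = sign_policy_sketch(example_policy, "not-a-real-secret")
```

Both values are then submitted as form fields alongside the file when posting to S3.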

Module contents