Subtitle: Sunset and Sunrise grid aggregation & chi

Metadata

Additional title: Subtitle: Sunset and Sunrise grid aggregation & chi
Other contributing persons, institutions or organizations: DFG - Funder
Other contributing persons, institutions or organizations: Burghardt, Dirk (ORCID: 0000-0003-2949-4887) - ProjectLeader
Person(s) responsible for the content of the research data: Dunkel, Alexander (ORCID: 0000-0003-1157-7967)
Person(s) responsible for the content of the research data: Burghardt, Dirk (ORCID: 0000-0003-2949-4887)
Person(s) responsible for the content of the research data: Hartmann, Maximilian
Person(s) responsible for the content of the research data: Purves, Ross
Person(s) responsible for the content of the research data: Hauthal, Eva (ORCID: 0000-0001-8917-600X)
Description of further data processing: Throughout the project, we used the HLL functions union, intersection and cardinality estimation to generate the results shared in the publication. The initial data is reduced to a coarser 'data collection granularity' based on the HLL union function, which is sufficient for worldwide analysis. For coordinates, this means that we 'snap' points to a grid using a GeoHash precision of 5 (see [2]), which corresponds to an average aggregation distance of about four kilometers. Similarly, to explore temporal distributions, dates are grouped into distinct months and years. Distinct terms are selected from the post body, the post title and tags or hashtags, and used to explore associated semantics (what). From this initial data collection, measures are aggregated stepwise to (1) a 100x100 km grid, (2) the country level, and (3) the worldwide level. We chose a 100 km resolution as a balance for the worldwide analysis, after testing with both 50 km and 200 km. Notebooks (S1–S9) allow for exploration of results at arbitrary resolutions and extents. The count of unique elements (i.e. the estimated number of users) is used for visualizing relationships. We chose the signed chi value to capture over- and underrepresentation of sunset and sunrise with respect to the overall use of social media, rather than visualizing absolute counts [47–48,49 p156]. We use a spatial formulation of signed chi values as proposed in an exploratory analysis of social media by Clarke, Wood, Dykes & Slingsby [3]. Finally, we explore semantic patterns based on ranked terms for each country, using term frequency-inverse document frequency (TF-IDF) and binary cosine similarity to compare semantics between countries.
[2]: Ruppel P, Küpper A. Geocookie: A space-efficient representation of geographic location sets. J Inf Process. 2014;22: 418–424. doi:10.2197/ipsjjip.22.418
[3]: Clarke K, Wood J, Dykes J, Slingsby A. Interactive Visual Exploration of a Large Spatio-Temporal Dataset: Reflections on a Geovisualization Mashup. IEEE Trans Vis Comput Graph. 2007;13: 1176–1183.
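The signed chi value described above can be sketched as follows. This is a minimal illustration, not the code from the notebooks: the bin names and counts are hypothetical, and the normalization factor (total expected divided by total observed) is an assumption about how observed sunset/sunrise counts are brought to the scale of the overall activity.

```python
import math

def signed_chi(observed, expected, norm):
    """Signed chi value for one grid bin.

    Positive values indicate over-representation, negative values
    under-representation, relative to the expected (overall) activity.
    """
    if expected <= 0:
        return None  # no baseline activity in this bin, value undefined
    return (observed * norm - expected) / math.sqrt(expected)

# Hypothetical per-bin user counts (illustrative numbers, not from the dataset)
observed = {"bin_a": 30, "bin_b": 5, "bin_c": 0}     # e.g. users posting about sunset
expected = {"bin_a": 400, "bin_b": 200, "bin_c": 0}  # e.g. all users in the bin

# Assumed normalization: scale observed counts to the magnitude of expected counts
norm = sum(expected.values()) / sum(observed.values())

chi = {key: signed_chi(observed[key], expected[key], norm) for key in observed}
for key, value in chi.items():
    print(key, value)
```

In this toy input, bin_a comes out positive (over-represented) and bin_b negative (under-represented), while bin_c has no baseline and is left undefined.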
Type of data collection: Other: Data query from public Application Programming Interfaces (APIs).
Research instruments used: Carto-Lab Docker v0.9.0 (https://gitlab.vgiscience.de/lbsn/tools/jupyterlab)
Underlying research objects: Other: Public online reactions to the sunset and sunrise.
Abstract: Events profoundly influence human-environment interactions. Through repetition, some events manifest and amplify collective behavioral traits, which significantly affects landscapes and their use, meaning, and value. However, the majority of research on reaction to events focuses on case studies, based on spatial subsets of data. This makes it difficult to put observations into context and to isolate sources of noise or bias found in data. As a result, inclusion of perceived aesthetic values, for example, in cultural ecosystem services, as a means to protect and develop landscapes, remains problematic. In this work, we focus on human behavior worldwide by exploring global reactions to sunset and sunrise using two datasets collected from Instagram and Flickr. By focusing on the consistency and reproducibility of results across these datasets, our goal is to contribute to the development of more robust methods for identifying landscape preference using geo-social media data, while also exploring motivations for photographing these particular events. Based on a four-facet context model, reactions to sunset and sunrise are explored for Where, Who, What, and When. We further compare reactions across different groups, with the aim of quantifying differences in behavior and information spread. Our results suggest that a balanced assessment of landscape preference across different regions and datasets is possible, which strengthens representativity and exploring the How and Why in particular event contexts. The process of analysis is fully documented, allowing transparent replication and adoption to other events or datasets. The deposit encompasses both code (Jupyter notebooks) and data (abstracted using HyperLogLog). Please see the git repository for any further information: https://gitlab.vgiscience.de/ad/sunset-sunrise-paper
Methods or procedures applied: We use a workflow based on HyperLogLog (HLL) that was first demonstrated by Dunkel et al. [1], who studied the user frequency of worldwide Flickr posts and quantified the effects on privacy. HLL allowed us to reduce the data collection footprint to quantitative measurements early in the process. Consequently, the study illustrated here can be repeated without the need to store raw data, providing both performance and privacy benefits [1]. All quantities available through this data repository and reported in the paper are estimates, with guaranteed error bounds of ±2.30% [1].
[1]: Dunkel A, Löchner M, Burghardt D. Privacy-aware visualization of volunteered geographic information (VGI) to analyze spatial activity: a benchmark implementation. ISPRS Int J Geo-Information. 2020;9. doi:10.3390/ijgi9100607
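To make the HLL operations named in this record (union, intersection, cardinality estimation) more concrete, here is a simplified, self-contained HyperLogLog sketch in pure Python. It is not the postgresql-hll implementation used in the project: the hash function, register layout and corrections are illustrative assumptions, and the user IDs are made up. With an assumed 2^11 = 2048 registers, the typical relative error of about 1.04 / sqrt(2048) ≈ 2.3% would be consistent with the figure stated above.

```python
import hashlib
import math

class HLL:
    """Simplified HyperLogLog set (illustrative, not postgresql-hll compatible)."""

    def __init__(self, p=11):
        self.p = p                      # number of index bits (assumption: 11)
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash taken from SHA-1 (an arbitrary choice for this sketch)
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                     # first p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1  # position of the leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def union(self, other):
        # Lossless merge: register-wise maximum
        merged = HLL(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def cardinality(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw

# Two hypothetical user sets that overlap in 5,000 users
a, b = HLL(), HLL()
for i in range(10_000):
    a.add(f"user{i}")
for i in range(5_000, 15_000):
    b.add(f"user{i}")

union_est = a.union(b).cardinality()                   # true value: 15,000
inter_est = a.cardinality() + b.cardinality() - union_est  # inclusion-exclusion
rel_err = 1.04 / math.sqrt(a.m)                        # typical relative error
print(round(union_est), round(inter_est), round(rel_err, 4))
```

Only the registers are stored, never the user IDs themselves, which is why the original items cannot be listed from an HLL set. Note that the intersection is derived from three estimates via inclusion-exclusion, so its error is usually larger than that of a union.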
Further explanatory information on the data: The workflow to load and process the provided data is available as Jupyter Notebooks and as HTML conversions of these notebooks.
Table of contents:

S10 Dataset (S10.zip): Anonymized data (CSV files with HLL sets) to reproduce results using the code in Jupyter Notebooks (S1-S9).
  • flickr_sunrise_hll.csv: Observed Frequencies Usercount/Sunrise for Flickr (HLL) (2.83 MB)
  • flickr_sunset_hll.csv: Observed Frequencies Usercount/Sunset for Flickr (HLL) (2.91 MB)
  • flickr_all_hll.csv: Expected Frequencies Usercount for Flickr (HLL) (19.69 MB)
  • instagram_sunrise_hll.csv: Observed Frequencies Usercount/Sunrise for Instagram (HLL) (4.14 MB)
  • instagram_sunset_hll.csv: Observed Frequencies Usercount/Sunset for Instagram (HLL) (9.10 MB)
  • instagram_random_hll.csv: Expected Frequencies Usercount for Instagram (HLL) (19.51 MB)
  • 2020-04-07_Flickr_Sunrise_World_CCBy.csv: Flickr geotagged Creative Commons Sample Photos (Metadata) for Sunrise (7.31 MB)
  • 2020-04-07_Flickr_Sunset_World_CCBy.csv: Flickr geotagged Creative Commons Sample Photos (Metadata) for Sunset (25.0 MB)
  • 20210202_FLICKR_SUNSET_random_country_tf_idf.csv: TF-IDF Scores for Flickr Sunset (59 KB)
  • 20211029_FLICKR_SUNSET_random_country_cosine_similarity_binary.csv: Binary Cosine Similarity for Flickr Sunset (1.0 MB)
  • flickr_sunset_terms_country.csv: Flickr Sunset User Terms grouped by distinct Country (su_a3 Code) (151 MB)
  • flickr-sunrise-months.csv: Flickr Sunrise (postcount) HLL data for each month (0.36 MB)
  • flickr-sunset-months.csv: Flickr Sunset (postcount) HLL data for each month (0.36 MB)
  • flickr-terms.csv: Flickr Postcount (HLL) per search term (46.8 KB)
  • instagram-terms.csv: Instagram Postcount (HLL) per search term (49.6 KB)
  • flickr-all.csv: Flickr total Postcount (HLL) (2.54 KB)

This repository contains a series of nine notebooks (release_v1.0.0.zip):
  • S1: the grid aggregation notebook (01_gridagg.ipynb) is used to aggregate data from HLL sets at GeoHash 5 to a 100x100 km grid
  • S2: the visualization notebook (02_visualization.ipynb) is used to create interactive maps, with additional information shown on hover
  • S3: the chimaps notebook (03_chimaps.ipynb) shows how to compute the chi-square test per bin and event (sunset/sunrise)
  • S4: the results notebook (04_combine.ipynb) shows how to combine results from sunset/sunrise into a single interactive map
  • S5-S9: notebooks 5 to 9 are used for creating additional graphics and statistics

HTML conversions of the notebooks:
  • S1 Jupyter Notebook: 01_grid_agg.html
  • S2 Jupyter Notebook: 02_visualization.html
  • S3 Jupyter Notebook: 03_chimaps.html
  • S4 Jupyter Notebook: 04_combine.html
  • S5 Jupyter Notebook: 05_countries.html
  • S6 Jupyter Notebook: 06_semantics.html
  • S7 Jupyter Notebook: 07_time.html
  • S8 Jupyter Notebook: 08_relationships.html
  • S9 Jupyter Notebook: 09_statistics.html
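The binary cosine similarity file listed in the table of contents compares term usage between countries. A minimal sketch of the measure itself follows, assuming that each country is represented by the set of terms it uses (presence or absence only). The country codes and terms are hypothetical and do not reflect the column layout of the CSV files.

```python
import math

def binary_cosine(terms_a, terms_b):
    """Cosine similarity of two binary term vectors, computed on sets.

    With 0/1 vectors, the dot product is the number of shared terms and
    each vector norm is the square root of the number of distinct terms.
    """
    a, b = set(terms_a), set(terms_b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

# Hypothetical top-ranked terms per country (illustrative only)
country_terms = {
    "AAA": ["beach", "sea", "clouds", "sky"],
    "BBB": ["mountain", "clouds", "sky", "snow"],
    "CCC": ["city", "skyline"],
}

sim_ab = binary_cosine(country_terms["AAA"], country_terms["BBB"])  # 2 shared of 4 each
sim_ac = binary_cosine(country_terms["AAA"], country_terms["CCC"])  # no shared terms
print(sim_ab, sim_ac)  # → 0.5 0.0
```

Because only term presence is considered, the measure ignores how often a term is used and reflects the overlap of vocabularies between two countries.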
Additional keywords: data, hyperloglog, repository, code, jupyter
Language: eng
Year or period of creation: 2017-2022
Year of publication: 2023
Publisher: Technische Universität Dresden
References to supplementary materials: IsPartOf: 123456789/5791 (Handle)
References to supplementary materials: References: doi:10.3390/ijgi9100607 (DOI)
References to supplementary materials: References: doi:10.2197/ipsjjip.22.418 (DOI)
References to supplementary materials: References: https://gitlab.vgiscience.de/lbsn/tools/jupyterlab (URL)
References to supplementary materials: References: https://gitlab.vgiscience.de/ad/sunset-sunrise-paper (URL)
Content of the research data: Dataset, Software, Workflow: The data contains generalized information on people's public responses to the sunset and sunrise on social media. At data query time, the data was statistically abstracted using the probabilistic data structure (PDS) HyperLogLog (HLL). HLL estimates the number of distinct items in a set by an irreversible approximation, which prevents identification of individual users from the collected data and significantly improves data processing performance.
Rights holder: Technische Universität Dresden
Usage rights of the dataset: CC-BY-NC-4.0
Software used: Resource Processing: Python 3.7
Software used: Resource Processing: xarray 0.16.1
Software used: Resource Processing: Shapely 1.7.1
Software used: Resource Processing: pyproj 2.6.1.post1
Software used: Resource Processing: pandas 1.1.2
Software used: Resource Processing: numpy 1.19.1
Software used: Resource Processing: matplotlib 3.3.2
Software used: Resource Processing: mapclassify 2.3.0
Software used: Resource Processing: ipython 7.18.1
Software used: Resource Processing: holoviews 1.13.4
Software used: Resource Processing: geoviews 1.8.1
Software used: Resource Processing: geopandas 0.8.1
Software used: Resource Processing: Fiona 1.8.17
Software used: Resource Processing: Cartopy 0.18.0
Software used: Resource Processing: bokeh 2.2.1
Software used: Other: Postgres 13
Software used: Resource Production: postgresql-hll v2.14
Software used: Other: Jupytext 1.14.0
Software used: Other: JupyterLab v3.5.0
Further description of the subject area(s): Social Cartography
Subject areas: Geological Science
Subject areas: Computer Science
Subject areas: Information Technology
Subject areas: Psychology
Subject areas: Software Technology
Subject areas: Social Sciences
Subject areas: Environmental Science and Ecology
Subject areas: Behavioural Sciences
Title of the dataset: Supplementary materials for the publication "From sunrise to sunset: Exploring landscape preference through global reactions to ephemeral events captured in georeferenced social media"



The data packages appear in:

  • Supporting Information: From sunrise to sunset - Exploring landscape preference through global reactions to ephemeral events captured in georeferenced social media
    This collection contains Supporting Information for the publication "From sunrise to sunset - Exploring landscape preference through global reactions to ephemeral events captured in georeferenced social media" (PLOS).