Identifying Effective Factors in Information Retrieval from Linked Databases in Virtual Libraries

Document Type: Research Paper

Authors

1 Department of Knowledge and Information Science, Bab.C., Islamic Azad University, Babol, Iran

2 Department of Knowledge and Information Science, Bab.C., Islamic Azad University, Babol, Iran

3 Department of Knowledge and Information Science, Bab.C., Islamic Azad University, Babol, Iran

10.30473/mrs.2026.76290.1672

Abstract

Introduction
Semantic web and linked data technologies offer transformative potential for knowledge organization in digital libraries, enabling machine-readable connections that enhance discoverability and interoperability. Virtual libraries, as major digital repositories, are ideal for implementing these technologies. However, their practical adoption, especially in contexts like Iran, faces significant challenges. These include data quality issues, metadata inconsistencies, vocabulary mismatches, structural heterogeneities, and language-specific barriers. For Persian-language resources, unique orthographic features and native subject systems further complicate semantic linking and increase error rates. Additionally, gaps exist between the sophistication of linked data systems and the skill levels of both end-users and information professionals, while user interfaces often fail to support the exploratory search that linked data enables. Prior research lacks a systematic, holistic investigation of factors affecting information retrieval from linked databases in Iranian virtual libraries. Consequently, this study aimed to identify these critical factors and develop a comprehensive conceptual model illustrating their dynamic interrelationships to provide a strategic framework for both theory and practice.
Methodology
This study employed a qualitative methodology based on the grounded theory approach by Glaser and Strauss. This interpretive, inductive paradigm was chosen for its suitability in exploring a complex phenomenon where pre-existing theory is limited, allowing theory to emerge directly from empirical data. The target population consisted of experts with substantial knowledge and experience in Information Science, Information Technology, and Computer Science. Participants were selected via purposive and theoretical sampling, with strict inclusion criteria: a minimum doctoral degree and at least five years of direct experience with linked data, digital libraries, or semantic information retrieval. Data were collected through 15 in-depth, semi-structured interviews, continuing until theoretical saturation was achieved after the twelfth interview, with three additional confirmatory interviews. Interviews focused on participants' experiences, perceptions, and insights regarding enablers and barriers in linked data information retrieval. Data analysis followed the grounded theory stages of open, axial, and selective coding. Open coding involved line-by-line analysis to identify initial concepts. Axial coding refined these concepts into broader categories and subcategories, exploring their relationships. Selective coding integrated all categories into a coherent theoretical model. To ensure trustworthiness, validation techniques included member checking, peer review, detailed methodological documentation, and test-retest procedures for coding consistency.
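The saturation criterion described above can be illustrated with a minimal sketch. It assumes one possible operationalization: saturation is the last interview that contributed a previously unseen code, provided a fixed number of later interviews add nothing new. The function name, the `confirm` parameter, and the toy code labels are hypothetical and are not taken from the study.

```python
def saturation_point(codes_per_interview, confirm=3):
    """Return the 1-based index of the last interview that added a new code,
    once `confirm` consecutive later interviews have added none.
    Returns None if that condition is never met."""
    seen = set()
    last_new = 0
    for i, codes in enumerate(codes_per_interview, start=1):
        new_codes = set(codes) - seen
        if new_codes:
            last_new = i
        seen |= new_codes
        # Stop once enough confirmatory interviews have passed without new codes.
        if last_new and i - last_new >= confirm:
            return last_new
    return None


# Hypothetical data: twelve interviews each add a new code, three add none.
interviews = [{f"code{i}"} for i in range(1, 13)] + [{"code1"}] * 3
print(saturation_point(interviews))
```

With this toy input the function reports interview 12, which mirrors the pattern reported in the study (saturation at the twelfth interview, followed by three confirmatory interviews).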
Findings
Analysis of the interview data identified four principal, interdependent categories of factors that together determine the effectiveness of information retrieval from linked databases. These categories form the pillars of the proposed conceptual model, which holds that good performance depends on their integration rather than on optimizing any one of them in isolation:
Technical Factors
This dimension covers the infrastructure and system capabilities needed to store, process, and query linked data effectively and dependably. It consists of four key components. First, Infrastructure Stability and Performance involves requirements such as high system availability, fast response times, sufficient processing power for complex queries, scalability as data volumes grow, and reliable backup systems. Second, Query Efficiency and Optimization concerns the system's internal mechanisms, including advanced indexing, query optimizers, caching of common queries, and detailed performance analysis. Third, Quality of Links to External Data is central to linked data's value; it covers the breadth, reliability, and maintenance of connections to outside datasets, including link quantity, authority, consistency, and upkeep. Fourth, Advanced Semantic Capabilities refers to intelligent system features such as reasoning engines for drawing inferences, support for custom inference rules, navigation of ontology structures during search, and tools such as fuzzy text search.
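Two of the capabilities named above, navigation of ontology structures during search and caching of common queries, can be sketched in a few lines. This is a toy in-memory illustration, not the architecture of any system discussed in the study: the `ex:` identifiers, the triple set, and both function names are hypothetical, and a production system would typically use a triple store with a query language such as SPARQL.

```python
from functools import lru_cache

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = frozenset({
    ("ex:Manuscript", "rdfs:subClassOf", "ex:Document"),
    ("ex:Letter", "rdfs:subClassOf", "ex:Manuscript"),
    ("ex:item1", "rdf:type", "ex:Letter"),
    ("ex:item2", "rdf:type", "ex:Document"),
})


@lru_cache(maxsize=128)  # cache repeated class-hierarchy lookups
def subclasses_of(cls):
    """Return cls plus all of its transitive subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        parent = frontier.pop()
        for s, p, o in TRIPLES:
            if p == "rdfs:subClassOf" and o == parent and s not in found:
                found.add(s)
                frontier.append(s)
    return frozenset(found)


def instances_of(cls):
    """Find resources typed as cls or as any subclass of cls."""
    classes = subclasses_of(cls)
    return sorted(s for s, p, o in TRIPLES if p == "rdf:type" and o in classes)


print(instances_of("ex:Document"))
```

A search for `ex:Document` also returns `ex:item1`, which is typed only as `ex:Letter`; the match follows from walking the subclass hierarchy rather than from a literal type match.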
Content Factors
This dimension concerns the inherent quality, structure, and currency of the data within the linked database. High-quality content is vital for any effective retrieval system and includes several aspects. Data Accuracy and Correctness is the basic requirement that data be error-free and factually correct, validated before entry, and supported by user error reporting and the minimization of duplicates. Semantic Richness and Depth goes beyond accuracy to a rich data model with diverse entity relationships, detailed ontologies, deep classification systems, thorough descriptions, and varied data types such as geographic or temporal information. Controlled Vocabularies involve systematic terminology management using standards such as SKOS, including synonym networks, multilingual support, guidance on term selection, and links to external ontologies for better interoperability.
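The role of a controlled vocabulary with synonym networks and multilingual labels can be illustrated with a minimal sketch of query expansion. The `Concept` class loosely mirrors the SKOS notions of preferred and alternative labels, but the class, the `ex:digital-library` identifier, the sample labels, and `expand_term` are all hypothetical and are not drawn from the study.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: dict                                # language code -> preferred label
    alt_labels: dict = field(default_factory=dict)  # language code -> list of synonyms

    def all_labels(self):
        labels = set(self.pref_label.values())
        for synonyms in self.alt_labels.values():
            labels.update(synonyms)
        return labels


# Hypothetical one-concept vocabulary with English and Persian labels.
VOCAB = [
    Concept(
        uri="ex:digital-library",
        pref_label={"en": "digital library", "fa": "کتابخانه دیجیتال"},
        alt_labels={"en": ["virtual library", "electronic library"]},
    ),
]


def expand_term(term, vocab=VOCAB):
    """Return every label of the concept that matches term, or [term] if none matches."""
    needle = term.strip().casefold()
    for concept in vocab:
        labels = concept.all_labels()
        if needle in {label.casefold() for label in labels}:
            return sorted(labels)
    return [term]


print(expand_term("Virtual Library"))
```

A user who types a non-preferred synonym is thereby mapped to the full label set of the concept, including the label in the other language, which is one way a vocabulary can reduce the vocabulary mismatches mentioned in the introduction.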
Human and Organizational Factors
This category emphasizes the role of people, their skills, and the institutional environment that supports them. Technology and content alone are insufficient without skilled human involvement and organizational backing. User's Linked Data Literacy involves the user's basic understanding of linked data concepts such as entities and RDF, the ability to read data visualizations, skill in forming simple queries, and knowledge of ontologies. User's Information and Subject Literacy covers broader information skills, such as clearly defining information needs, using keywords effectively, critically evaluating results, and having sufficient knowledge of the topic. Active Role of Librarians casts librarians as key mediators, which requires deep awareness of linked data capabilities, the ability to help users craft queries, leadership in training programs, and the capacity to bridge user needs and technical complexity.
User Interface and User Experience (UI/UX) Factors
This dimension focuses on the design and functionality of the system's front-end, which directly handles all human-computer interaction. A powerful backend becomes inaccessible without an intuitive and supportive interface, defined by four characteristics. User-Friendly Design follows principles of clarity, with logical layout and navigation, responsiveness across devices, familiar visual elements, and accessibility standards for inclusivity. Advanced Search Features go beyond basic search boxes to offer tools like graphical query builders that simplify complexity, faceted filtering options, and capabilities to save, reload, and share search strategies. Results Visualization and Display takes advantage of linked data's graph nature by providing visual representations of results and their connections, alongside standard list and table views, plus mapping for spatial data and explanations for inferred results. Support and Guidance includes integrated help systems like context-sensitive assistance, pre-made query examples, chatbots or FAQs for quick solutions, and easy feedback and error-reporting channels.
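Faceted filtering, one of the advanced search features listed above, reduces to two operations: counting the values available for a facet and narrowing the result set to the values a user selects. The sketch below uses hypothetical records and field names chosen only for illustration.

```python
from collections import Counter

# Hypothetical catalogue records; field names are illustrative only.
RECORDS = [
    {"title": "Record A", "language": "fa", "type": "thesis"},
    {"title": "Record B", "language": "fa", "type": "article"},
    {"title": "Record C", "language": "en", "type": "article"},
]


def facet_counts(records, facet):
    """Count how many records carry each value of the given facet."""
    return Counter(r[facet] for r in records if facet in r)


def apply_filters(records, **selected):
    """Keep only records that match every selected facet value."""
    return [r for r in records
            if all(r.get(name) == value for name, value in selected.items())]


print(facet_counts(RECORDS, "language"))
print([r["title"] for r in apply_filters(RECORDS, language="fa", type="article")])
```

An interface would typically show the counts next to each facet value and recompute them after every selection, so users can see how each choice narrows the results before committing to it.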
Discussion and Conclusion
The study demonstrates that effective information retrieval from linked databases is not merely a technical challenge but a complex, multi-dimensional socio-technical phenomenon. Success depends on the harmonious and dynamic integration of four inextricably linked pillars: Technical Robustness, Content Credibility, Human-Organizational Competence, and User-Centric Design. The proposed model posits that these dimensions interact systemically; weakness in any one fundamentally undermines the entire system. For example, an advanced semantic reasoner (Technical) is useless with inaccurate data (Content), and perfectly structured data (Content) is inaccessible without user literacy (Human) or a clear interface (UI/UX).
