Q2. What future work do the authors mention in the paper "Flink: semantic web technology for the extraction and analysis of social networks"?
While technology is important, keeping in touch with social science will be just as important in the future. Creating a social ontology that would allow social relationships to be classified along several dimensions is among the future work, as is finding patterns for identifying these relationships in electronic data. However, networks themselves may also be the subject of much debate in the future, especially if their sources were originally created for a different purpose, so that their integration could not have been foreseen. For example, a practical question the authors encountered in their work concerns the multiplexity of social relations: a relationship between two individuals may have a different significance in different areas of social life.
Q3. What is the current bottleneck in scalability?
In terms of technology, the current bottleneck in scalability is the performance of aggregation (identity reasoning) due to the lack of standard query and rule languages and efficient implementations in RDF stores.
Q4. What is the key idea in the structural approach to social science?
A key idea in the structural approach to social science is that the way an actor (an individual or a group) is embedded in a network offers opportunities and imposes constraints on the actor.
Q5. What is the primary reason why the authors cannot benefit from using HayStack?
The uniqueness of presenting social networks is also the primary reason that the authors cannot benefit from Semantic Web portal generators such as HayStack [5], which are primarily targeted at browsing more traditional object collections.
Q6. What is the main purpose of the web mining component of Flink?
The web mining component of Flink employs a co-occurrence analysis technique first applied to social network extraction in the work of Kautz et al. [14].
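To make the idea of co-occurrence analysis concrete, here is a minimal sketch in Python. It assumes that tie strength is a Jaccard-style ratio of page counts (pages mentioning both names divided by pages mentioning either name) and that ties below a threshold are dropped. The function names, the threshold value and the page counts are all hypothetical; the answer above does not specify the measure Flink uses.

```python
def tie_strength(hits_a, hits_b, hits_both):
    """Jaccard-style ratio: pages naming both people / pages naming either."""
    union = hits_a + hits_b - hits_both
    if union <= 0:
        return 0.0
    return hits_both / union


def extract_ties(page_counts, pair_counts, threshold=0.1):
    """Keep only the pairs whose tie strength reaches the threshold.

    page_counts: name -> number of pages mentioning that name
    pair_counts: (name_a, name_b) -> number of pages mentioning both
    The threshold is an illustrative assumption, not a value from the paper.
    """
    ties = {}
    for (a, b), both in pair_counts.items():
        strength = tie_strength(page_counts[a], page_counts[b], both)
        if strength >= threshold:
            ties[(a, b)] = strength
    return ties


# Made-up page counts for three hypothetical researchers.
page_counts = {"alice": 100, "bob": 50, "carol": 400}
pair_counts = {("alice", "bob"): 30, ("alice", "carol"): 5}
ties = extract_ties(page_counts, pair_counts)
```

With these made-up counts only the alice/bob pair survives, since the alice/carol pair co-occurs on very few pages relative to how often each name appears on its own.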
Q7. What is the main reason why the interface is important?
The authors consider the flexibility of the interface important because there are many possibilities to present social networks to the user and the best way of presentation may depend on the size of the community as well as other factors.
Q8. What types of knowledge sources are used by Flink?
Flink uses four different types of knowledge sources: HTML pages from the web, FOAF profiles from the Semantic Web, public collections of emails and bibliographic data.
Q9. What is the alternative source of bibliographic information?
An alternative source of bibliographic information (used in previous versions of the system) is the Bibster peer-to-peer network [9], from which metadata can be exported directly in the SWRC ontology format.
Q10. What is the danger of a close mapping between the ontology and the run-time?
The danger of a close mapping between the ontology and the run-time model is that the application needs to be rewritten whenever the underlying ontology changes.
Q11. What is the disadvantage of rule-based expansion of equivalence?
The rule-based expansion of equivalence has the disadvantage that it requires the storage of the same information about all the equivalent instances.
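The storage cost can be illustrated with a small Python sketch. It assumes triples are plain tuples and that equivalence is given as pairs of identifiers (in the spirit of owl:sameAs); the identifiers and helper names are hypothetical, and for brevity only subjects are expanded, not objects. This is not the rule set used in Flink, only an illustration of why expansion duplicates statements.

```python
def equivalence_classes(same_as_pairs):
    """Group identifiers into equivalence classes with a simple union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())


def expand(triples, same_as_pairs):
    """Copy every statement to all instances equivalent to its subject."""
    members = {}
    for cls in equivalence_classes(same_as_pairs):
        for x in cls:
            members[x] = cls
    expanded = set()
    for s, p, o in triples:
        for equivalent in members.get(s, {s}):
            expanded.add((equivalent, p, o))
    return expanded


# Two statements about what turns out to be one person with three identifiers.
triples = {
    ("ex:p1", "foaf:name", "Peter"),
    ("ex:p2", "foaf:mbox", "mailto:p@example.org"),
}
same_as = [("ex:p1", "ex:p2"), ("ex:p2", "ex:p3")]
expanded = expand(triples, same_as)
```

Here two original statements grow to six after expansion, because each statement is stored once per equivalent identifier. Keeping one canonical identifier per class would avoid the duplication at the price of rewriting queries or data.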
Q12. What is the purpose of the web mining component?
The web mining component also performs the additional task of finding topic interests, i.e. associating researchers with certain areas of research.
Q13. What is the way to store data on the scale of millions of triples?
From a scalability perspective, the authors are glad to note that the Sesame server offers very high performance in storing data on the scale of millions of triples, especially using native repositories.
Q14. How can a developer improve the performance of a query?
In many cases, the developer himself can improve the performance of a query by rewriting it manually, e.g. by reordering the terms or breaking the query in two.
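Why reordering terms can matter is easy to see with a deliberately naive evaluator. The Python sketch below joins triple patterns left to right with nested loops and counts how many pattern/triple comparisons it makes. It is a toy model with made-up data, not a description of how Sesame or any real RDF store evaluates queries.

```python
def match(pattern, triple, binding):
    """Try to extend a variable binding so that the pattern matches the triple."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variables must bind consistently across patterns.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def evaluate(patterns, triples):
    """Evaluate patterns left to right; return (bindings, comparisons made)."""
    bindings, work = [{}], 0
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                work += 1
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings, work


# 100 hypothetical people, each with a type statement and a name statement.
triples = [(f"ex:p{i}", "rdf:type", "foaf:Person") for i in range(100)]
triples += [(f"ex:p{i}", "foaf:name", f"Name{i}") for i in range(100)]

is_person = ("?p", "rdf:type", "foaf:Person")   # matches 100 triples
has_name = ("?p", "foaf:name", "Name7")         # matches 1 triple

unselective_first, work_a = evaluate([is_person, has_name], triples)
selective_first, work_b = evaluate([has_name, is_person], triples)
```

Both orderings return the same single answer, but putting the selective term first keeps the intermediate result small and so needs far fewer comparisons in this toy model. A query optimizer would ideally do this automatically; the answer above suggests that in practice the developer sometimes has to.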
Q15. Why might social connectivity have increased in importance in recent years?
Their social connectivity might have even increased in importance in recent years simply by virtue of the information overload the authors are facing.
Q16. What is the trade-off between executing many small queries and a single large query?
The trade-off is in terms of memory footprint versus communication overhead: small, targeted queries are inefficient due to the communication and parsing involved, while large queries produce large result sets that need to be further processed on the client side.