
Journal of Law, Information and Science


Burda, Daniel; Teuteberg, Frank --- "Why Discard When You Can Keep Them? A Case Study on the E-Mail Retention Behaviour in Firms" [2013] JlLawInfoSci 9; (2013) 22(2) Journal of Law, Information and Science 183


Why Discard When You Can Keep Them? A Case Study on the E-Mail Retention Behaviour in Firms

DANIEL BURDA[*] AND FRANK TEUTEBERG[**]

Abstract

Firms are increasingly required to consciously retain and dispose of specific information as part of an effort to ensure compliance with legal and regulatory mandates. While e-mails represent a major part of all corporate records, they can be used as electronic evidence in legal investigations and compliance audits. However, the decision towards e-mail retention or disposal is often incumbent upon employees in the course of performing their jobs. This paper presents the results of a case study seeking to uncover how and why employees retain e-mails. We employ qualitative and quantitative data collection methods, thereby analysing mailboxes of 20 employees and more than 700,000 e-mails. Our findings point towards different types of employee behaviour and a fractional tendency to hoard vast amounts of e-mail pursuing a ‘keep everything forever’ mentality. Based on the consolidated findings, we elaborate a set of propositions, highlight the organisational implications and suggest opportunities for future research.

Introduction

In the light of the recent rise and increasing diffusion of social media in the corporate context, e-mail might appear an outdated and old-fashioned means of business communication. However, there is no doubt that e-mail is still the pervasive and ubiquitous application that interconnects the majority of business information and corporate communication.[1] E-mail has also gained a reputation as a ‘smoking gun’.[2] Anecdotal evidence from the practitioner’s community and market studies suggests that e-mail is one of the most important means in the preparation of legal evidence in litigation and regulatory investigations.[3] Consistent with this, recent litigation shows that firms have to reckon with being accused based on evidence found in e-mail. Examples are the US government’s lawsuit against Standard & Poor’s[4] and the dispute between Oracle and Google.[5] Similarly, as other prominent examples such as ING[6] and Morgan Stanley[7] show, firms risk being fined when they fail to retrieve retained e-mails upon the request of regulators or the courts. In both cases the firms had to pay penalties of up to USD 15 million, because they were not able to retrieve retained e-mails. Today, risks related to compliant e-mail retention are a growing challenge for many firms, since these risks may have decisive consequences that are not exclusively monetary; they may also result in a loss of credibility and reputation.[8]

According to the private sector research firm Gartner, software vendors have responded to these demands by offering extended archiving products referred to as Enterprise Information Archiving (EIA) solutions. Those solutions support archiving of electronically stored information (ESI), such as e-mail, and provide e-discovery functionality as well as policy-based mailbox management.[9] Although firms are investing in EIA solutions to manage their ageing data assets, the role of people, decision rights and policies is emphasised in an attempt to ensure the effective retention of a firm’s information assets in line with business, legal and regulatory objectives.[10] On the other hand, it is recognised that people tend to overvalue information, which seems to foster a tendency to amass rather than discard digital information.[11] Bearing in mind that a decision towards e-mail retention is often left to the end-user due to the lack of respective policies[12] and thus might be up to chance, this points towards considerable organisational implications between the poles of human behaviour and corporate objectives.[13] Guided by the following research question, it is thus the intent of this study to provide an understanding of employees’ e-mail retention behaviour in a firm: How and why do employees retain corporate e-mails?

In this case study, we address our research question by analysing a set of 20 mailboxes of employees in a major software firm. The paper is structured as follows: In the following section, we present related research on the relevant topics and highlight the research gap. Next, we describe our research methodology followed by the presentation of our findings. Then, we discuss our findings and suggest a set of propositions. Finally, we conclude this study by elaborating the study’s implications and opportunities for future research.

1 Related Work

A review of the extant literature revealed only one publication explicitly focused on e-mail retention and governance issues. Knolmayer et al’s[14] study proposes indicators to develop a maturity model for e-mail governance and examines the e-mail governance maturity of various firms. Based on their empirical examination, they conclude that firms are struggling to implement robust policies for handling, archiving and deleting e-mails. Moreover, Volonino[15] presents an overview of computer forensics to encourage research into e-mail archives and e-records management. According to Volonino, e-mail is currently considered one of the primary sources of e-evidence in many legal actions, while IT departments are rarely prepared for the issues that e-discovery imposes on active/archival data operations. Similarly, Ward et al describe the organisational challenges in responding to e-discovery requests in a timely and cost-effective manner: ‘While storage costs of ESI may be inexpensive, managing ESI is not particularly so when the company has not implemented rigorous policies on e-mail usage ... and ESI document retention as part of a litigation readiness program.’[16] A challenging aspect of e-discovery is data collection that nowadays runs into terabytes of data, including e-mails, which makes systematic and cost-effective retention management a necessity.[17] E-mail metadata, such as sender and subject line information, have to be kept online and be easily accessible and searchable for all types of regulatory, audit and legal inquiries.[18] According to a Gartner study, the increase in legal discovery associated with e-mail has driven the demand for new e-mail archiving applications, which is recognised as one of the fastest growing segments in the software market. E-mails are considered to consume large amounts of storage and IT budgets, and requirements for archiving, e-discovery and compliance add further cost to the management of e-mail.[19]

Another stream of research relevant to this study focuses on the examination of behaviour in relation to individual e-mails. This research, mostly stemming from the domain of Human Computer Interaction (HCI), follows a typical design cycle to derive user requirements from which new system features, that improve e-mail solutions,[20] can be developed. For instance, Whittaker and Sidner empirically investigated the use of e-mail applications.[21] Based on a mailbox analysis of employees, they found that employees maintain an average of 47 folders and keep 2 482 e-mails, 34 per cent of which are older than three months. They identified three e-mail filing strategies, namely, no filing, spring-cleaning and frequent filing. From these strategies they derived functional requirements to redesign e-mail applications. Ten years later, Fisher et al conducted a similar study to compare their findings with those from 1996.[22] They found that inboxes have roughly the same number of items, but employees’ e-mail archives have grown tenfold with a mean of 28 660 e-mail items. According to Fisher et al, 43 per cent of all items were older than three months while the number of folders increased to 133 in comparison to 1996. Other studies examine individual differences in dealing with e-mail messages,[23] the problem of e-mail overload[24] and the role of e-mail in task management.[25]

While acknowledging the amount of existing research, our review shows that the examination of e-mail governance, and more specifically the challenges of e-mail retention, is still scarce. It is thus the intention of this paper to address this research gap while focusing on the individual behaviour of employees in retention and deletion of e-mails and its implications for the organisation.

2 Research Methodology

For this research endeavour, we decided to conduct a case study for the following reasons. Case studies are deemed an appropriate method for investigating ‘why’ and ‘how’ research questions as well as ‘sticky, practice-based problems where the experiences of the actors are important and the context of action is critical’.[26] Moreover, there is little research and sound theoretical knowledge available on the topic of employees’ e-mail retention behaviour. As such, the nature of this case study is more exploratory, seeking to establish a foundation for future research by documenting the experiences and knowledge gained from practice. Guided by our research questions, we apply an approach referred to as ‘soft-positivism.’[27] This approach enables us to draw from a positivist view, which assumes that e-mail retention behaviour is a relatively stable and objectively existing phenomenon. At the same time, in line with an interpretive perspective, we allow other constructs to emerge from the collected data. Our overall approach is described in the following subsections and represents the case study protocol.

2.1 Unit of Analysis and Case Selection

The unit of analysis of the present study is an employee’s e-mail retention behaviour in a firm. We selected a single-case design with multiple embedded units of analysis representing the participating employees while the firm represents the single case sample being constant for all embedded units. This design increases the evidential significance of our findings and the study’s external validity since it supports the replication of results.[28] To address our research questions, we followed the ‘typical case’ sampling strategy.[29] We observed a multi-national software development firm where e-mail provides the typical means of internal and external communication, scheduling and calendaring and is used by every employee on a daily basis in the course of business. The globally operating firm is based in continental Europe and employs more than 55 000 people in more than 120 countries. In addition to various geo-specific legislation, the firm is obliged to comply with regulations prescribed by the Sarbanes-Oxley Act of 2002 (SOX).[30] The firm uses Microsoft Outlook Exchange as their corporate-wide e-mail system. We were granted access to the research site and thus could directly ask employees from different departments for their participation in this study while applying the snowballing strategy to win additional participants. In total we were able to acquire 20 participants from three different departments with an average affiliation with the firm of 7.3 years. Table 1 provides an overview of participant profiles and department affiliation.

Attribute     Category            Frequency   Percentage
Department    Research (R)               11           55
              Consulting (C)              8           40
              Project Mgmt (P)            1            5
Age [years]   26-35                      11           55
              36-45                       7           35
              46-55                       1            5
              56-65                       1            5
Manager       yes                         3           15
              no                         17           85
Gender        f                           4           20
              m                          16           80

Table 1: Participant Demographics Overview

2.2 Data Collection and Data Analysis

Yin proposes three principles of data collection to increase the robustness of the results, namely: (1) using multiple sources of evidence, (2) creating a case study database and (3) maintaining a chain of evidence.[31] Following the first principle and in line with Kaplan and Duchon,[32] we used a combination of qualitative and quantitative methods to collect data. The data collection took place between June and August 2012 and included unstructured interviews, participant observation, a tool-supported mailbox analysis and a survey questionnaire to overcome an issue reported in prior research, namely that information about deleted e-mails is difficult to capture.[33] As the firm under study employs Microsoft Outlook Exchange, we developed a macro in Visual Basic for Applications (VBA) to ease the data collection without having to install other software components on the participants’ computers. The macro automatically captures an employee’s mailbox data by reading all e-mail items and folders stored online (ie on the server) and offline (ie in a local archive), including metadata such as size, last modification and received date/time. Depending on the number of e-mails a specific user retained, the runtime of the macro varied between 10 and 150 minutes. We scheduled personal meetings or telephone calls with each of the participants, in which we introduced them to the scope of the study. We explained the approach and emphasised respect for data anonymity and privacy, which is reportedly a major concern in such studies.[34] Subsequently, we deployed the macro in their local Microsoft Outlook client and started the automatic data collection. During or after the analysis we interviewed the participants about their personal reasons for e-mail retention and their usage of the archiving functions in Outlook. During the interviews participants also presented their Outlook client, enabling us to observe the way they retain and file e-mails. We documented the results of the interviews and observations in writing and took field notes after the meetings.

In line with the second data collection principle, we created a case study database where we stored all data for subsequent analysis. In order to maintain a chain of evidence (third principle), we aggregated the quantitative mailbox data consisting of 718 783 single e-mail items stepwise. We thereby stored a snapshot of each step to allow tracing back and forth between the raw data and aggregations and eventually our interpretations. In an effort to increase the ‘objectivity’[35] of our findings and ease the comparison between participants by the means of quantitative data, we decided to survey the participants with an online questionnaire. To develop the questionnaire, we started augmenting the interview transcripts and field notes with ‘reflective remarks’[36] and research literature. We commenced open coding, whereby statements in the transcripts and field notes pertaining to some reasons for retention of e-mails were used, to form themes and categories.[37] Those themes defined the focus of the online questionnaire and guided its development. Before we started the survey, the questionnaire was reviewed by two research colleagues. According to their comments, we revised the wording of some questions to improve the clarity and adjusted the sequence of questions. Based on the set of data collected during the interviews and the survey responses, we conducted the analysis of the overall data in response to our research question.
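The aggregation of item-level metadata into per-user figures can be illustrated with a minimal Python sketch. It is not the VBA macro used in the study; the record layout (`MailItem`), the function name `summarise_mailbox` and the 90-day approximation of ‘three months’ are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MailItem:
    """Hypothetical record for one captured e-mail item."""
    received: date    # received or sent date
    size_mb: float    # item size in megabytes
    archived: bool    # True if stored offline in a local archive


def summarise_mailbox(items, snapshot, months_of_affiliation):
    """Aggregate item metadata into per-user figures.

    Assumption: 'older than three months' is approximated as more than
    90 days before the snapshot date.
    """
    total = len(items)
    cutoff = snapshot - timedelta(days=90)
    older = sum(1 for item in items if item.received < cutoff)
    archived = sum(1 for item in items if item.archived)
    oldest = min(item.received for item in items)
    return {
        "total_emails": total,
        "total_size_mb": sum(item.size_mb for item in items),
        "archived_emails": archived,
        "pct_older_3_months": 100 * older / total,
        "pct_archived": 100 * archived / total,
        "archived_per_month": archived / months_of_affiliation,
        "mailbox_age_years": (snapshot - oldest).days / 365.25,
    }


# Two made-up items, only to show the call; they are not study data.
example_items = [
    MailItem(date(2012, 1, 1), 1.0, True),
    MailItem(date(2012, 7, 1), 0.5, False),
]
print(summarise_mailbox(example_items, date(2012, 8, 1), 10))
```

Keeping the raw item list separate from the derived summary mirrors the idea of a traceable chain of evidence: each aggregate can be recomputed from the stored items.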

3 Findings

3.1 How Do Employees Retain E-mails?

The studied firm has a set of global policies in place, such as an information security policy, travel policy and purchasing policy, that are binding on all employees. However, during the time of data collection, there was no formal policy that governs the retention and disposal of e-mails. Every employee has a server size quota that restricts the total size of e-mails to be stored on the server to 215 MB, while all received and sent e-mail is kept by default until the user explicitly deletes it. Once the mailbox size reaches 90 per cent of the server quota, users are automatically notified via e-mail to delete unrequired items. Once the user’s mailbox size equals the given quota of 215 MB, he/she will not be able to send or receive any e-mails. At that point, the user has to decide whether and which e-mails to discard or retain. To archive specific e-mails in Outlook the user can create a local archive, ie, a personal storage table (PST file) that is stored on the user’s local hard drive. All items to be retained can be moved manually or automatically (auto archiving) from the server to the local archive (offline archive); the latter option is only used by 30 per cent of the employees in our sample. As a consequence of this configuration, users have the freedom to decide what e-mail to retain or discard since there is no policy guiding the decision. From a technical perspective, there is practically no storage limit on the local hard drive and backup server that restricts the number of retained e-mails. Table 2 provides an aggregated excerpt of the mailbox analysis results including the total, mean, median, minimum, maximum and standard deviation (SD) values across users, while the first character of the user ID indicates the organisational unit of the user (column A).

Column groups as given in the source: User Attributes, Mailbox Characteristics, Archive Characteristics

A = User ID
B = Affiliation with the Firm [years]
C = Date of Oldest E-mail in Mailbox
D = Age of Mailbox [years]
E = Total Number of E-mails
F = Total Size of E-mails [MB]
G = Total Number of Archived E-mails
H = Percentage of E-mails Older than 3 Months
I = Percentage of Archived E-mails
J = Archived E-mails per Month of Affiliation
K = Regular Backup

A       B     C         D     E          F          G          H       I       J     K
C01     5.8   27.09.06  5.8   72 450     5452.70    70 893     97.70%  97.90%  1027  yes
R02     4.3   26.03.08  4.3   50 188     5544.40    49 282     94.30%  98.20%  948   yes
R03     15    25.01.99  13.4  48 581     7837.70    47 817     95.90%  98.40%  266   yes
C04     4.3   28.04.08  4.2   36 737     5275.40    34 840     90.30%  94.80%  683   yes
R05     5.7   31.10.06  5.7   29 652     3906.70    24 364     90.00%  82.20%  358   yes
C06     10    03.04.07  5.3   25 427     3813.00    24 702     90.70%  97.10%  206   yes
R07     5.8   29.09.06  5.8   17 533     1589.90    14 434     88.40%  82.30%  206   yes
R08     8     04.04.11  1.2   12 220     1077.80    9745       60.00%  79.70%  102   yes
R09     10.7  04.03.08  4.3   10 130     1174.30    7190       78.50%  71.00%  56    yes
P10     4.3   14.03.08  4.3   8608       1090.50    6821       83.30%  79.20%  134   yes
C11     7.5   18.07.05  7     4871       194.4      1356       68.50%  27.80%  15    no
R12     4.6   02.01.08  4.5   90 110     8898.90    86 241     93.10%  95.70%  1568  yes
C13     4.1   30.05.08  4.1   16 237     2488.30    15 710     95.90%  96.80%  321   yes
R14     4.4   27.07.08  4     35 629     3563.30    34 960     97.50%  98.10%  660   yes
R15     7.8   27.09.04  7.8   89 967     10 186.20  89 369     92.60%  99.30%  961   yes
R16     2.4   01.02.10  2.4   5332       929.3      4294       89.80%  80.50%  148   yes
C17     12    28.08.00  11.9  45 776     4814.50    44 543     96.80%  97.30%  309   no
R18     17.1  11.11.99  12.7  90 272     10 289.00  89 465     95.80%  99.10%  436   yes
C19     6     11.09.09  2.9   13 755     2179.10    12 647     71.70%  91.90%  176   yes
C20     6.5   29.12.05  6.5   15 308     1249.50    7892       59.60%  51.60%  101   yes
Total   -     -         -     718 783    81 554.90  676 565    92.00%  94.10%  8681  -
Mean    7.3   -         5.9   35 939.10  4077.70    33 828.30  86.50%  86.00%  434   -
Median  5.9   -         4.9   27 539.50  3688.10    24 533     90.50%  95.30%  287   -
Min     2.4   25.01.99  1.2   4871       194.4      1356       59.60%  27.80%  15    -
Max     17.1  04.04.11  13.4  90 272     10 289.00  89 465     97.70%  99.30%  1568  -
SD      3.9   -         3.3   29 264.20  3173.30    29 733.10  12.3    18.4    413   -

Table 2: Excerpt of Aggregated Mailbox Analysis Results (n=20)

3.1.1 Mailbox Characteristics

As can be seen from Table 2, the 20 participants of our study keep 718 783 e-mails in total (column E) that are up to 13.4 years old (column D) and consume 81 554.9 MB (~80 gigabytes) of storage in total (column F). At the lower and upper ends, the size of the mailbox per user ranges from 194.4 MB (C11) to 10 289 MB (R18), while the average mailbox size equals 4077.7 MB.

Considering the total number of e-mails held by a specific user, we find a range between 4871 e-mails at the lower end and 90 272 e-mails at the upper end, which equals a difference by a factor of 18.5. Moreover, it can be seen that in 11 out of 20 cases the age of the mailbox (column D), determined by the date of the oldest e-mail captured in the mailbox (column C), differs only slightly (≤ 0.1 years) from, or even equals, the term of affiliation with the firm (column B) as, eg, in the case of C01, R02 or C20. Significant differences between the mailbox age and the term of affiliation can be accounted for by spring-cleaning actions (eg, C06, R08) or loss of e-mails due to a job change associated with a location and hardware change (R09, R18). Comparing the age of mailboxes, we find an increase in contrast to Fisher et al.[38] It can be observed from column D that 75 per cent of all participants retain e-mails dating back at least 4.2 years, 50 per cent for at least 5.3 years and 25 per cent for at least seven years. In line with this finding, it can be seen from column H that in total 92 per cent of all retained e-mails are older than three months, ie, the e-mail was sent or received at least three months ago. Those items account for 93.2 per cent of the total storage need for e-mails.
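The descriptive figures in this subsection can be recomputed from Table 2. The following Python sketch uses the column D values as listed there; the helper name `share_at_least` is an illustrative assumption, and 1 GB is taken as 1024 MB.

```python
# Age of mailbox in years (Table 2, column D), in the order of the user IDs.
mailbox_age_years = [5.8, 4.3, 13.4, 4.2, 5.7, 5.3, 5.8, 1.2, 4.3, 4.3,
                     7, 4.5, 4.1, 4, 7.8, 2.4, 11.9, 12.7, 2.9, 6.5]


def share_at_least(values, threshold):
    """Fraction of values that are greater than or equal to the threshold."""
    return sum(1 for v in values if v >= threshold) / len(values)


# Ratio between the largest and smallest e-mail count (column E).
range_factor = 90_272 / 4_871

# Total storage in gigabytes, assuming 1 GB = 1024 MB (column F total).
total_gb = 81_554.9 / 1024

print(round(range_factor, 1))                  # factor between extremes
print(round(total_gb, 1))                      # total storage in GB
print(share_at_least(mailbox_age_years, 4.2))  # share retaining >= 4.2 years
print(share_at_least(mailbox_age_years, 5.3))  # share retaining >= 5.3 years
print(share_at_least(mailbox_age_years, 7))    # share retaining >= 7 years
```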

3.1.2 Archive Characteristics

We define an archived e-mail as one stored offline on the local hard drive in a PST file that can only be accessed with the local Outlook client on the respective user’s computer. In contrast, e-mails stored online can be accessed by the employee via mobile devices or a webmail portal. Archived items are subject to the user’s individual decisions regarding backup (column K), security measures, deletion or permanent retention. From the percentage values in column I, we find that 94.1 per cent, ie, 676 565 of all 718 783 analysed e-mails, are archived by the user and account for 79 008.1 MB (96.6 per cent) of the total storage demand. To account for the different terms of affiliation with the firm, we standardised the total number of archived e-mails by each employee’s months of affiliation (data collected via survey), as can be seen from column J. This relative archived item count ranges from a minimum of 15 to a maximum of 1568 archived e-mails per month of affiliation, ie, it varies by a factor of roughly 105 (1568/15), with an average of 434. Moreover, we calculated the average archive growth based on the 10 mailboxes that provide at least five years of data and observe that the archive size has grown 6.5 fold in five years, from 0.57 GB to 3.69 GB at a compound annual growth rate (CAGR) of 59.45 per cent, ie, the size more than doubles every two years, which is consistent with estimates from other studies.[39]
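The growth figures can be checked with the standard CAGR formula, (end / start)^(1 / periods) - 1. The Python sketch below is a hedged illustration: the paper does not state how many compounding periods were used, so the period counts are assumptions. With the rounded sizes of 0.57 GB and 3.69 GB, four annual intervals (five yearly observations) give roughly 59.5 per cent, close to the reported 59.45 per cent, whereas five intervals would give roughly 45.3 per cent.

```python
import math


def cagr(start, end, periods):
    """Compound annual growth rate over the given number of periods."""
    return (end / start) ** (1 / periods) - 1


# Rounded archive sizes in GB as reported in the text.
start_gb, end_gb = 0.57, 3.69

# Assumption: five yearly observations span four compounding intervals.
growth_four_intervals = cagr(start_gb, end_gb, 4)
# Alternative assumption: five full compounding intervals.
growth_five_intervals = cagr(start_gb, end_gb, 5)

# Doubling time in years implied by the reported CAGR of 59.45 per cent.
doubling_years = math.log(2) / math.log(1 + 0.5945)

print(f"{growth_four_intervals:.2%}")
print(f"{growth_five_intervals:.2%}")
print(f"{doubling_years:.2f}")
```

At the reported rate the implied doubling time is about one and a half years, which agrees with the statement that the size more than doubles every two years.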

3.1.3 Deletion Behaviour

As in similar studies,[40] our automated mailbox analysis cannot capture e-mails that have already been deleted. This might bias our results since the total number of e-mails depends not only on the number of received or sent items but also on the deletion behaviour of the employee. In an effort to limit this effect, we asked the participants of the study to describe their deletion behaviour using a seven-point Likert-type rating scale ranging from 1 (I do not delete e-mails) to 7 (I delete e-mails). Figure 1 depicts the participants’ responses in a scatter plot where the x-axis represents the given answers on the rating scale and the y-axis represents the relative number of archived e-mails (column J). As illustrated in Figure 1, at least two opposite groups of employees can be distinguished, depicted there as A and B. Those groups differ significantly in the number of retained e-mails (t(11) = 6.35, p < 0.001, r = 0.79). Group A retains 974.48 e-mails on average (SE = 134.16) and claims rather not to delete e-mails. Group B tends to delete e-mails and retains 153.26 e-mails on average (SE = 36.12). This general tendency is also supported by an inverse correlation between the number of retained e-mails and the deletion behaviour (Spearman’s ρ = -0.615, p < 0.01).

Figure 1: Claimed Deletion Behaviour in Relation to Number of Retained Items

3.2 Why Do Employees Retain E-mails?

Based on the analysis of the interview data and literature,[41] we deduced five reasons for e-mail retention. We asked the participants of this study to rate their agreement with those reasons on a Likert-Type scale ranging from 1 (strongly disagree) to 7 (strongly agree).

ID    Mean   SD     Min   Max   Description
RE1   6.25   0.72   5     7     I retain e-mails just because I might need them in the future.
RE2   5.65   1.71   2     7     I retain e-mails to reliably hold on to important information for long periods of time, e.g., project documentations, contracts or descriptions of how to perform a complicated task.
RE3   5.45   1.63   2     7     I retain e-mails to be able to demonstrate justification of my decisions.
RE4   4.55   5.00   1     7     I retain e-mails because it simply takes too long to decide which e-mail to retain and which to delete to make it worth the effort.
RE5   3.90   3.25   1     7     For me retaining e-mails is a way of not getting lagged behind other people.

Table 3: Aggregated Ratings of E-mail Retention Reasons (n = 20)

The reasons, together with descriptive statistics of the employees’ ratings, are shown in Table 3. As can be seen from Table 3, respondents mainly retain e-mails because they assume that they might be useful in the future. Other major reasons are to preserve important information, to justify their decisions and/or because making decisions about disposal or retention is considered a time-consuming task. Reason RE1 shows a high degree of agreement across all participants with an average rating of 6.25, a standard deviation (SD) of 0.72 and a minimum rating of 5, while RE2 and RE3 show average ratings of 5.65 and 5.45 and standard deviations of 1.71 and 1.63 respectively. RE4 receives an average rating of 4.55 with a standard deviation of 5.00. RE5 shows a lower level of agreement across participants with a mean rating of 3.90 and a standard deviation of 3.25. Besides the given reasons for retention, we observed that employees perceived storage to be a generally inexpensive and unlimited resource one can easily make use of. For example, participant R15 stated: ‘Why not [keep] all my e-mails? Storage prices have fallen and are not a big deal nowadays.’

4 Interpretation of Findings and Propositions

Aggregating our key findings from the field data in response to our research questions, we suggest a set of propositions presented in Table 4. Firstly, the quantitative mailbox data show that the relative number of archived e-mails varies broadly between users, ie, between 15 and 1568 archived e-mails per month of affiliation, a factor of roughly 105. Looking at Figure 1, it might not be surprising that users claiming to delete e-mails retain significantly fewer e-mails than users who claim to abstain from deleting. Rather, the question is how those contrasting behaviours towards retention/deletion can be explained. Characterising the two opposites revealed (groups A and B in Figure 1), there seems to be a type of employee who tends to hoard e-mails, retaining them in a ‘keep everything forever’ manner (average archived e-mails: 974). In contrast, group B tends to discard e-mails, retaining only a subset in a more selective way (average archived e-mails: 153). This observation also finds support in existing research.[42] However, viewing this difference in the light of many similarities, such as a general lack of an e-mail retention policy and the use of the same e-mail client and server quota, indicates that a decision towards retention or deletion is contingent on behavioural factors. Bearing in mind the issues of compliance, the notion of e-mail as a smoking gun and potential fines for firms, this is an important finding and thus leads to proposition P1.

Secondly, it should be noted that 11 employees show no difference between the mailbox age and the term of employment, which implies that they retain e-mails dating back to their first working days within the firm. While some of the differences can be explained by reasons such as loss of e-mail or spring-cleaning (group A), we may acknowledge a fundamental tendency. It seems that once an employee has made the decision to retain an e-mail, this decision is not revised by re-evaluating the e-mail at a later point in time. This indication is also supported by the high CAGR of 59.45 per cent. In support of extant research confirming that people prefer to keep their options open and that they are rather averse to changing their retention decisions, we formulate P2.[43]

Thirdly, we observe vast amounts of old e-mails being retained. On average, 86 per cent of a user’s mailbox size is attributed to archived e-mails that are additionally backed up regularly by 90 per cent of the participants. Roughly projecting this sample’s average mailbox size (~ 4 GB) to the total population of employees in the firm (which we estimate to be 55 000), we estimate a total volume of 213.9 terabytes (TB) accounting for the storage of e-mails. This amount of information has to be stored, managed and possibly reviewed in the course of e-discovery requests amidst an archive growth of roughly 60 per cent annually, leading to an ever-accumulating storage need. It is acknowledged that such growth drives both the complexity of information management and the overall cost of e-mail, and falling storage prices do not compensate for the strong growth rates.[44] According to a recent study, 40 per cent of total e-mail costs can be attributed to storage and archiving.[45] In addition, we find several reports citing the high costs associated with e-discovery. Depending on corporate practices and how well firms are prepared, the cost of analysing e-mails in audits or legal investigations may run into millions.[46] However, our field data point towards a lack of awareness with regard to the impact of one’s individual e-mail retention habits on information management issues, legal risks and costs. During the interviews, employees seemed to be free of concerns regarding the impact of their individual retention behaviour on a corporate level. This finding is also reflected in the identified retention reasons provided in Table 3 and points towards the need to educate and inform employees about the issues associated with e-mail retention. The high mean ratings (and low SD) of the first three retention reasons in Table 3 point towards a motivation to retain e-mails which is rather intrinsic in nature. As a consequence, we formulate proposition three (P3).
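The rough firm-wide projection of 213.9 TB can be reproduced from the mean mailbox size in Table 2 and the estimated headcount. The Python sketch below assumes binary units (1 TB = 1024 × 1024 MB); the constant and function names are illustrative.

```python
MEAN_MAILBOX_MB = 4077.7   # mean total size of e-mails per user (Table 2)
EMPLOYEES = 55_000         # estimated number of employees in the firm
MB_PER_TB = 1024 * 1024    # assumption: binary units


def projected_volume_tb(mean_mailbox_mb, employees):
    """Project the per-user mean mailbox size to the whole workforce."""
    return mean_mailbox_mb * employees / MB_PER_TB


total_tb = projected_volume_tb(MEAN_MAILBOX_MB, EMPLOYEES)
print(f"{total_tb:.1f} TB")
```

The projection is only as representative as the sample: it extrapolates the mean of 20 mailboxes from three departments to the whole firm.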

Fourthly, we draw a parallel from the area of information security, where employees are considered a valuable resource in achieving information security in alignment with business objectives. Transferring this concept to our context, such an alignment requires an understanding of the e-mail’s content, its relative value, associated legal and regulatory relevance as well as potential reuse opportunities across other business processes. As such, the appropriateness of an exclusively technological solution can be questioned.[47] Despite the benefits of technological measures, information security research emphasises the importance of formal and informal control mechanisms, such as policies, organisational culture, training or awareness.[48] Based on the premise that individual user behaviour and socio-organisational measures are also important in the context of e-mail retention and acknowledging the lack of an e-mail retention policy in the firm under study, we formulate P4.

P1
Proposition: A decision towards retention is contingent on behavioural factors in the absence of a binding guideline for the retention of e-mails.
Exemplary quotes from study participants:
  C11: ‘I only retain e-mails that are important for my current task. Once I have completed it, I delete the related e-mails. Otherwise it is getting too much.’
  C19: ‘I do not delete e-mails at all. I just keep everything. I think every e-mail has its purpose.’
Literature: Gwizdka[49]

P2
Proposition: Once the decision towards retention is made by the employee, a revised decision towards deletion is rather unlikely.
Exemplary quote from study participants:
  C01: ‘When I decide to retain an e-mail, I usually do not revise this decision later on. It takes time and is not worth the effort.’
Literature: Whittaker[50]

P3
Proposition: Employees are not aware of the implications of their individual retention behaviour and the associated impact on cost and risk on firm-level.
Exemplary quote from study participants:
  R15: ‘Why not keeping all my e-mails? Storage prices have fallen and are not a big deal nowadays.’
Literature: Sanchez[51]

P4
Proposition: Firms can improve the employee’s retention behaviour by the means of socio-organisational measures such as policies or trainings.
Exemplary quote from study participants:
  R05: ‘I think that trainings that provide concrete and tangible guidelines for email retention ... eg, based on the description of real-life scenarios, would be helpful.’
Literature: Ward et al and Knolmayer et al.[52]

Table 4: Propositions Explaining Employee E-Mail Retention Behaviour

5 Conclusions

5.1 Implications for Practice and Scientific Community

The present study was designed to explore how and why employees retain e-mails and to elaborate on the implications for firms. We employed a combination of qualitative and quantitative data collection methods to gather data from 20 employees in a major software firm. Connecting the findings from the field in response to our research questions, we suggest a set of propositions that highlight our key findings. The findings offer practical implications for firms that use e-mail and, thus, are faced with a constantly growing volume of e-mail. Our findings provide empirical support that a decision towards retention or disposal is contingent on behavioural factors in the absence of any corporate guidance. Moreover, our results indicate a lack of awareness of the effects of each individual’s e-mail retention behaviour on legal risks and costs at the corporate level. However, the way firms retain, manage and retrieve information will impact their risk exposure and legal costs.[53] Thus, acknowledging the findings of this study may help firms to improve their e-mail retention procedures by initiating appropriate measures, which should not be of a technological nature only. Socio-organisational measures should also be considered in order to raise awareness of the impact of individual retention behaviour and to promote conscious decision making regarding e-mail retention, so that a ‘keep everything forever’ culture can be avoided.

This study also has some general implications for the research community, as it contributes to the body of knowledge on e-mail retention within a firm. Our findings indicate the existence of different types of behavioural patterns in e-mail retention and deletion among employees. This study thus provides motivation for further research into the cognitive and environmental factors involved, in order to provide a better understanding of the determinants of this behaviour. Further, our study shows that there is little extant research that has investigated the issues organisations face with e-mail and information retention from a compliance or legal point of view. Moreover, organisations are obliged to retain specific information in an effort to ensure compliance, a task that appears to become more complex under the tension between rising legal requirements and exponential data growth. As such, this study points to a number of research opportunities on e-mail retention from an information governance perspective.

5.2 Limitations and Future Research

As with every study, this study has some limitations that should be noted when interpreting the findings. Firstly, this exploratory case study was conducted in two countries in Europe but only one firm, which raises questions about the external validity, ie, generalisability of our findings.[54] Although we collected and analysed data from several employees, statistical generalisation is impossible to achieve with 20 units. The relatively small sample size was mainly due to difficulties in recruiting research participants owing to their concerns about data privacy and data loss. This issue is also reported in extant research.[55] Many respondents refused to participate as they perceived their inbox as a very personal and confidential repository of business and personal information. Although we tried to convince employees to participate by showing them examples of result reports of mailboxes to prove that only anonymous data was extracted, they were nonetheless uneasy. Other respondents refused to participate because they were anxious that they would lose access to their e-mails due to the macro installation. Nevertheless, it should be noted that case research should be judged on its theoretical generalisability and as such differs from sampling research that aims at a statistical generalisability of its findings.[56] At the same time, given the problems in acquiring participants, our findings may be subject to non-response bias.[57] Although the comparison of participants suggests that the findings and patterns hold true for the other employees of the firm, including the non-respondents in this study might have provided additional insights and a more complete understanding of employee retention behaviour. This is because these employees represent a significant constituent of the overall population of interest.
Moreover, there may be cultural or structural influences that vary across different firms, industries and countries that need to be taken into account when interpreting our results. For example, different litigation systems and litigation cultures may impact the way firms and employees manage e-mail retention. Examining a rather small or mid-sized firm that does not operate globally or that is not required to comply with SOX requirements may provide additional insights into an employee’s retention behaviour and the external factors that influence this behaviour. Also, investigating further firms in different industries with, eg, different e-mail clients, mailbox quotas or policies provides an interesting opportunity for future research to uncover similarities and differences in employees’ behaviour.

Secondly, we were only able to collect the mailbox data at one specific date. Although we collected a large set of 718,783 e-mail records ranging back to 1999, we still lack a more dynamic view for understanding both an employee’s retention behaviour over time and when retention decisions are made.

Thirdly, we lack a deeper understanding of differences and commonalities between different types of employees, including the decision-making rationales in e-mail retention that, for example, could support the development of a taxonomy of different user behaviours. Toward that end, additional qualitative in-depth case studies could be conducted to identify antecedents and cognitive factors influencing an employee’s e-mail retention behaviour. In a subsequent step, scholars could take a more positivist research approach to operationalise relevant constructs influencing an employee’s retention behaviour and apply more quantitative research designs. In this effort, hypotheses should be developed and tested, eg, through experiments or large-scale surveys, to assist the development of a ‘theory of explaining’[58] on the e-mail retention behaviour of employees. Therefore, future research should be conducted with larger and diversified samples from various organisations from different industry sectors and geographical regions to allow statistical generalisation and to increase the external validity of the findings. While the findings of the present study should be viewed in light of the described limitations, they have nevertheless yielded preliminary insights about how and why employees retain e-mails. We thus believe this study provides some useful, albeit tentative, findings that should be of interest to both scholars and practitioners.


[*] University of Osnabrueck, Institute of Information Management and Information Systems, Katharinenstr. 1, 49069 Osnabrueck, Germany, <dburda@uni-osnabrueck.de>.

[**] University of Osnabrueck, Institute of Information Management and Information Systems, Katharinenstr. 1, 49069 Osnabrueck, Germany, <frank.teuteberg@uni-osnabrueck.de>.

[1] Mimecast, The Shape of Email - Research Report (2012); Judith Ramsay and Karen Renaud, ‘Using Insights from E-mail Users to Inform Organisational E-mail Management Policy’ (2012) 31(6) Behaviour & Information Technology 587.

[2] Linda Volonino, ‘Electronic Evidence and Computer Forensics’ (2003) 12 Communications of the Association for Information Systems 3.

[3] Association of Records Managers & Administrators (ARMA), ‘Study: E-Discovery Not Limited to E-Mail’ (2012) 46(1) Information Management 12.

[4] Amanda Bronstad, Feds Preparing to Sue Standard & Poor’s Over Pre-crash Ratings (4 February 2013) The National Law Journal <http://at.law.com/DxHifq>.

[5] Joe Mullin, Oracle Tells Jury ‘You Can’t Just Step on Somebody’s Intellectual Property’ (17 April 2012) Ars Technica <http://arstechnica.com/tech-policy/2012/04/oracle-tells-jury-you-cant-just-step-on-somebodys-intellectual-property/>.

[6] EEI, ‘ING Firms Settle Email Retention Case’ (11 February 2013) Compliance Reporter 47.

[7] Reuters, Morgan Stanley Offers $15M Fine for E-Mail Violations (2006) Computerworld <http://www.computerworld.com/s/article/108687/Morgan_Stanley_offers_15M_fine_for_e_mail_violations>.

[8] Nancy Flynn, The E-policy Handbook: Rules and Best Practices to Safely Manage your Company’s E-mail, Blogs, Social Networking, and Other Electronic Communication Tools (AMACOM, 2009).

[9] Sheila Childs, Kenneth Chin and Debra Logan, Magic Quadrant for Enterprise Information Archiving (2011).

[10] Vijay Khatri and Carol V Brown, ‘Designing Data Governance’ (2010) 53(1) Communications of the ACM 148; Gerhard F Knolmayer et al, ‘E-mail Governance: Are Companies in Financial Industries More Mature?’ (Paper presented at the 45th Hawaii International Conference on System Sciences, 2012).

[11] Joseph G Davis and Shayan Ganeshan, ‘Aversion to Loss and Information Overload: An Experimental Investigation’ (Paper presented at the International Conference on Information Systems (ICIS) 2009).

[12] Burke T Ward et al, ‘Recognizing the Impact of E-Discovery Amendments on Electronic Records Management’ (2009) 26(4) Information Systems Management 350.

[13] Elizabeth Lomas, ‘Information Governance: Information Security and Access Within a UK Context’ (2010) 20(2) Records Management Journal 182; Whitepaper: The Disconnect Between Legal and IT Teams (2009) Waterford Technologies <http://www.waterfordtechnologies.com/wp-content/uploads/2012/11/4.102.1-WHITE-PAPER-Disconnect_Legal_and_IT-DontKnows.pdf> .

[14] Knolmayer et al, above n 10.

[15] Volonino, above n 2.

[16] Ward et al, above n 12, 351.

[17] John C Ruhnka and John W Bagby, ‘Using ESI Discovery Teams to Manage Electronic Data Discovery’ (2010) 53(7) Communications of the ACM 142; Linda Volonino, Janice Sipior and Burke T Ward, ‘Managing the Lifecycle of Electronically Stored Information’ (2007) 24(3) Information Systems Management 231.

[18] Linda Volonino, Guy H Gessner and George F Kermis, ‘Holistic Compliance with Sarbanes-Oxley’ (2004) 14(1) Communications of the Association for Information Systems 219.

[19] Childs, Chin and Logan, above n 9; Ted Schadler, Should Your Email Live in the Cloud? A Comparative Cost Analysis (2009).

[20] For a comprehensive review, see, eg, Steve Whittaker, ‘Personal Information Management: From Information Consumption to Curation’ (2011) 45 Annual Review of Information Science and Technology 3.

[21] Steve Whittaker and Candace Sidner, ‘E-mail Overload: Exploring Personal Information Management of E-mail’ (Paper presented at the Conference on Human Factors in Computing Systems, Vancouver, 1996).

[22] Danyel Fisher et al, ‘Revisiting Whittaker & Sidner’s E-mail Overload Ten Years Later’ (Paper presented at the 20th Anniversary Conference on Computer Supported Cooperative Work, Banff Alberta, 2006).

[23] Deborah Barreau, ‘The Persistence of Behavior and Form in the Organization of Personal Information’ (2007) 59(2) Journal of the American Society for Information Science and Technology 307; Jacek Gwizdka, ‘Email Task Management Styles: The Cleaners and the Keepers’ (Paper presented at the CHI’04 Conference on Human Factors in Computing Systems, Vienna, 2004).

[24] Laura A Dabbish and Robert E Kraut, ‘Email Overload at Work: An Analysis of Factors Associated with Email Strain’ (Paper presented at the 20th Anniversary Conference on Computer Supported Cooperative Work, Banff Alberta, 2006).

[25] Nicolas Ducheneaut and Victoria Bellotti, ‘E-mail as Habitat: An Exploration of Embedded Personal Information Management’ (2001) 8(5) Interactions 30.

[26] Izak Benbasat, David K Goldstein and Melissa Mead, ‘The Case Research Strategy in Studies of Information Systems’ (1987) 11(3) MIS Quarterly 369, 369.

[27] Anna Madill, Abbie Jordan and Caroline Shirley, ‘Objectivity and Reliability in Qualitative Analysis: Realist, Contextualist and Radical Constructionist Epistemologies’ (2000) 91(1) British Journal of Psychology 1.

[28] Kathleen M Eisenhardt, ‘Building Theories from Case Study Research’ (1989) 14(4) Academy of Management Review 532; Robert K Yin, Case Study Research: Design and Methods (Sage, 2009).

[29] Guy Paré, ‘Investigating Information Systems with Positivist Case Study Research’ (2004) 13(1) Communications of the Association for Information Systems 233; Yin, above n 28.

[30] Sarbanes-Oxley Act of 2002, Pub L No 107–204, 116 Stat 745 (2002) <http://www.sec.gov/about/laws/soa2002.pdf> (‘SOX’).

[31] Yin, above n 28.

[32] Bonnie Kaplan and Dennis Duchon, ‘Combining Qualitative and Quantitative Methods in Information Systems Research: A Case Study’ (1988) 3(3) MIS Quarterly 571.

[33] Ashish Gupta et al, ‘E-mail Management: A Techno-Managerial Research Perspective’ (2006) 17(1) Communications of the Association for Information Systems 941.

[34] Ibid.

[35] Kaplan and Duchon, above n 32.

[36] Paré, above n 29, 249.

[37] Cathy Urquhart, ‘An Encounter with Grounded Theory: Tackling the Practical and Philosophical Issues’ in Eileen M Trauth (ed), Qualitative Research in IS: Issues and Trends (Idea Group, 2001) 104.

[38] Fisher et al, above n 22.

[39] IDC, The 2011 IDC Digital Universe Study (2011).

[40] Fisher et al, above n 22.

[41] Angela Edmunds and Anne Morris, ‘The Problem of Information Overload in Business Organisations: A Review of the Literature’ (2000) 20(1) International Journal of Information Management 17; Ronald L Thompson, Christopher A Higgins and Jane M Howell, ‘Personal Computing: Toward a Conceptual Model of Utilization’ (1991) 15(1) MIS Quarterly 125; Ron Weber, ‘The Grim Reaper: The Curse of E-mail’ (2004) 28(3) MIS Quarterly iii.

[42] Gwizdka, above n 23.

[43] Davis and Ganeshan, above n 11; Whittaker, above n 20.

[44] Ward et al, above n 12.

[45] Schadler, above n 19.

[46] Daniel E Braswell and W Ken Harmon, ‘Assessing and Preventing Risks from E-mail System Use’ (2003) 5 Information Systems Control Journal 33.

[47] Burcu Bulgurcu, Hasan Cavusoglu and Izak Benbasat, ‘Information Security Policy Compliance: An Empirical Study of Rationality-Based Beliefs and Information Security Awareness’ (2010) 34(3) MIS Quarterly 523.

[48] Tejaswini Herath and H Raghav Rao, ‘Protection Motivation and Deterrence: A Framework for Security Policy Compliance in Organisations’ (2009) 18(2) European Journal of Information Systems 106.

[49] Gwizdka, above n 23.

[50] Whittaker, above n 20.

[51] Anthony Sanchez, ‘Top 5 Strategic Email Compliance Mistakes’ (2005) Sarbanes-Oxley Compliance Journal <http://www.s-ox.com/dsp_getFeaturesDetails.cfm?CID=843>.

[52] Ward et al, above n 12; Knolmayer et al, above n 10.

[53] Volonino, Sipior and Ward, above n 17.

[54] Yin, above n 28.

[55] Gupta et al, above n 33.

[56] Bas Hillebrand, Robert AW Kok and Wim G Biemans, ‘Theory-testing Using Case Studies: A Comment on Johnston, Leach, and Liu’ (2001) 30(8) Industrial Marketing Management 651.

[57] J Scott Armstrong and Terry Overton, ‘Estimating Nonresponse Bias in Mail Surveys’ (1977) 14 Journal of Marketing Research 396.

[58] Shirley Gregor, ‘The Nature of Theory in Information Systems’ (2006) 30(3) MIS Quarterly 611.


URL: http://www.austlii.edu.au/au/journals/JlLawInfoSci/2013/9.html