106. AWS ElastiCache 01 - qyjohn/AWS_Tutorials GitHub Wiki
AWS ElastiCache offers a managed service for both Memcached and Redis. In this training, we will use Memcached for the examples.
(1) Memcached on EC2
If you want to install Memcached on Ubuntu 16.04, the following commands will just work:
$ sudo apt-get update
$ sudo apt-get install memcached
$ sudo service memcached restart
If you want to make sure that Memcached is listening on its default port (11211), you can use the telnet command to do a quick test.
$ telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
^]quit
telnet> quit
Connection closed.
By default, Memcached listens on localhost (127.0.0.1) only. Therefore, if you attempt to connect to Memcached using the private IP address of the EC2 instance, the connection will be refused. In the example below, 172.31.5.49 is my EC2 instance with Memcached installed, and I am running telnet from the same EC2 instance.
$ telnet 172.31.5.49 11211
Trying 172.31.5.49...
telnet: Unable to connect to remote host: Connection refused
If you want to set up a Memcached server that can be accessed from other EC2 instances in your VPC, you will need to modify the following line in /etc/memcached.conf, changing 127.0.0.1 to the private IP address of the EC2 instance. Note that the security group on the EC2 instance must also allow inbound traffic on port 11211 from your Memcached clients.
# Specify which IP address to listen on. The default is to listen on all IP addresses
# This parameter is one of the only security measures that memcached has, so make sure
# it's listening on a firewalled interface.
-l 127.0.0.1
You need to restart the Memcached service for the new configuration to take effect. After the restart, the connection to the private IP address succeeds:
$ sudo service memcached restart
ubuntu@ip-172-31-5-49:/etc$ telnet 172.31.5.49 11211
Trying 172.31.5.49...
Connected to 172.31.5.49.
Escape character is '^]'.
^]quit
telnet> quit
Connection closed.
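If telnet is not available, the same reachability test can be sketched in Java. This is only an illustration: the class name PortCheckSketch and the method isListening are hypothetical names, and the check only confirms that something accepts TCP connections on the port, not that it is really Memcached.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the original tutorial.
class PortCheckSketch
{
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isListening(String host, int port, int timeoutMillis)
    {
        try (Socket socket = new Socket())
        {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        }
        catch (IOException ex)
        {
            // Connection refused, timed out, or host could not be resolved
            return false;
        }
    }

    public static void main(String[] args)
    {
        // Default to localhost; pass the private IP address as the first argument
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + ":11211 listening = " + isListening(host, 11211, 2000));
    }
}
```

Running it with the private IP address of the instance before and after editing /etc/memcached.conf should mirror the two telnet results above.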
Now we know how to set up a Memcached server. Let's terminate the EC2 instance and learn Memcached using an ElastiCache Memcached instance.
(2) Memcached Basics
Launch an ElastiCache Memcached instance and configure its security group to allow inbound connections on port 11211 from your EC2 instance. Then connect to the instance with telnet:
$ telnet training.rmzfbh.0001.apse2.cache.amazonaws.com 11211
Trying 172.31.28.227...
Connected to training.rmzfbh.0001.apse2.cache.amazonaws.com.
Escape character is '^]'.
With Memcached, you can use get to retrieve data and delete to remove data. For example, if the value of your key is ABCDFF, you will see the following output:
get key
VALUE key 0 6
ABCDFF
END
delete key
DELETED
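To make the shape of the get reply concrete, here is a minimal Java sketch that parses a single-key reply such as the one above. The class and method names are hypothetical, and the sketch assumes the header line is plain ASCII and that the whole reply is already in memory as a string; a real client reads bytes from the socket instead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical parser for a single-key "get" reply; illustration only.
class GetResponseSketch
{
    // Returns the stored value, or null when the server answered only END (cache miss).
    static String parseSingleValue(String response)
    {
        if (response.startsWith("END"))
        {
            return null;
        }
        int headerEnd = response.indexOf("\r\n");
        // Header fields: VALUE <key> <flags> <bytes>
        String[] header = response.substring(0, headerEnd).split(" ");
        if (header.length < 4 || !header[0].equals("VALUE"))
        {
            throw new IllegalArgumentException("Unexpected reply: " + response);
        }
        int bytes = Integer.parseInt(header[3]);
        byte[] raw = response.getBytes(StandardCharsets.UTF_8);
        // The header is assumed to be ASCII, so its character offset equals its byte offset
        int dataStart = headerEnd + 2;
        return new String(raw, dataStart, bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args)
    {
        System.out.println(parseSingleValue("VALUE key 0 6\r\nABCDFF\r\nEND\r\n"));
    }
}
```

The byte count in the header, not the position of the next line break, decides where the value ends, which is why values may themselves contain line breaks.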
You can use set, add, replace, append, prepend, and cas to store data. These commands accept the following parameters:
- <key> : the key under which the data is stored
- <flags> : a 32-bit unsigned integer, provided by the user, that the server stores with the data and returns along with the data when the item is retrieved
- <exptime> : expiration time in seconds; 0 means the item never expires (although it may still be evicted when the cache is full), and if exptime is greater than 30 days (2,592,000 seconds), Memcached treats it as a UNIX timestamp for expiration
- <bytes> : number of bytes in the data block
- <cas unique> : unique 64-bit value of an existing entry (retrieved with the gets command) to use with the cas command
- [noreply] : optional parameter that tells the server not to send a reply
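The exptime rule is easy to get wrong, so here is a small Java sketch of how a value could be interpreted. The class and method names are hypothetical, the sketch assumes a non-negative exptime, and it models only the rule itself (0 means no expiration, values up to 30 days are relative, larger values are absolute UNIX timestamps), not the server's internal implementation.

```java
// Hypothetical illustration of the exptime rule; not Memcached's own code.
class ExptimeSketch
{
    // 30 days expressed in seconds
    static final long THIRTY_DAYS = 60L * 60 * 24 * 30;

    // Returns the absolute expiry time in UNIX seconds, or -1 if the item never expires.
    static long expiresAt(long exptime, long nowUnixSeconds)
    {
        if (exptime == 0)
        {
            return -1;                       // never expires
        }
        if (exptime <= THIRTY_DAYS)
        {
            return nowUnixSeconds + exptime; // relative offset from now
        }
        return exptime;                      // already an absolute UNIX timestamp
    }

    public static void main(String[] args)
    {
        long now = System.currentTimeMillis() / 1000;
        System.out.println("exptime 900 expires at " + expiresAt(900, now));
    }
}
```

A practical consequence: to cache something for 31 days you cannot pass 2678400, because that is read as a timestamp in 1970 and the item expires immediately; pass the current time plus 2678400 instead.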
These commands can return:
- STORED indicates success
- NOT_STORED indicates that the data was not stored because the condition for the "add" or "replace" command wasn't met, or the item is in a delete queue
- EXISTS indicates that the item you are trying to store with the "cas" command has been modified since you last fetched it
- NOT_FOUND indicates that the item did not exist or has been deleted
The formats of these commands are listed below. As you can see, the value for the key is provided on the line after the command, that is, after the first \r\n.
set <key> <flags> <exptime> <bytes> [noreply]\r\n<value>\r\n
add <key> <flags> <exptime> <bytes> [noreply]\r\n<value>\r\n
replace <key> <flags> <exptime> <bytes> [noreply]\r\n<value>\r\n
append <key> <flags> <exptime> <bytes> [noreply]\r\n<value>\r\n
prepend <key> <flags> <exptime> <bytes> [noreply]\r\n<value>\r\n
cas <key> <flags> <exptime> <bytes> <cas unique> [noreply]\r\n<value>\r\n
- Set - Store key/value pair in Memcached
set key 0 900 4
data
STORED
- Add - Store key/value pair in Memcached, but only if the server doesn’t already hold data for this key
add key 0 900 4
abcd
NOT_STORED
add key2 0 900 4
abcd
STORED
- Replace - Store key/value pair in Memcached, but only if the server already holds data for this key
replace key 0 900 4
ABCD
STORED
replace key2 0 900 4
ABCD
STORED
replace key3 0 900 4
ABCD
NOT_STORED
- Append - Add a value to an existing key, after the existing data. Append ignores the <flags> and <exptime> parameters, but you must still provide them!
append key 0 900 2
FF
STORED
get key
VALUE key 0 6
ABCDFF
END
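The storage command format shown earlier can also be assembled programmatically. The following Java sketch builds the exact text that would be sent to the server; the class and method names are hypothetical, and the value is assumed to be a UTF-8 string so that the byte count can be computed from it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical builder for set/add/replace/append/prepend commands; illustration only.
class StorageCommandSketch
{
    static String build(String command, String key, long flags, long exptime, String value)
    {
        // <bytes> is the length of the value in bytes, not in characters
        int bytes = value.getBytes(StandardCharsets.UTF_8).length;
        return command + " " + key + " " + flags + " " + exptime + " " + bytes + "\r\n"
            + value + "\r\n";
    }

    public static void main(String[] args)
    {
        // Same command as the "set key 0 900 4" example above
        System.out.print(build("set", "key", 0, 900, "data"));
    }
}
```

If the byte count does not match the data that follows, the server rejects the command, which is a common mistake when typing commands by hand in telnet.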
If you need to quit the Memcached telnet interface, simply use the quit command.
quit
Connection closed by foreign host.
As you can see, Memcached is an in-memory key-value store, and you can store, retrieve, and delete data using simple commands.
(3) Using Memcached with Java
To use Memcached with Java, you will need a Memcached client for Java. A very popular one is spymemcached. You will need Apache Ant to build spymemcached.
$ cd ~
$ sudo apt-get install ant
$ git clone https://github.com/couchbase/spymemcached
$ cd spymemcached
$ ant
In spymemcached/build/jars, you will find spymemcached-2.12.2.jar (the version number might be different). You will need to add this JAR to your CLASSPATH:
$ echo $CLASSPATH
/home/ubuntu/aws-java-sdk-1.11.98/lib/aws-java-sdk-1.11.98.jar:/home/ubuntu/aws-java-sdk-1.11.98/third-party/lib/*:/home/ubuntu/apache-log4j-1.2.17/log4j-1.2.17.jar:.
$ export CLASSPATH=$CLASSPATH:~/spymemcached/build/jars/spymemcached-2.12.2.jar
$ echo $CLASSPATH
/home/ubuntu/aws-java-sdk-1.11.98/lib/aws-java-sdk-1.11.98.jar:/home/ubuntu/aws-java-sdk-1.11.98/third-party/lib/*:/home/ubuntu/apache-log4j-1.2.17/log4j-1.2.17.jar:.:/home/ubuntu/spymemcached/build/jars/spymemcached-2.12.2.jar
Now let's use the following Java code to store some data in Memcached:
import java.net.InetSocketAddress;
import java.util.concurrent.Future;
import net.spy.memcached.MemcachedClient;

public class MemcachedSet
{
    public static void main(String[] args)
    {
        try
        {
            // Connecting to ElastiCache Memcached instance
            MemcachedClient mcc = new MemcachedClient(new InetSocketAddress("training.rmzfbh.0001.apse2.cache.amazonaws.com", 11211));
            System.out.println("Connection to server successful.");
            // Now set data into the Memcached server (expires after 900 seconds)
            Future<Boolean> fo = mcc.set("TestKey", 900, "This is a test message");
            // Print status of the set method
            System.out.println("set status:" + fo.get());
            // Retrieve and check the value from cache
            System.out.println("TestKey value in cache - " + mcc.get("TestKey"));
            // Shut down the Memcached client
            mcc.shutdown();
        } catch(Exception ex)
        {
            System.out.println( ex.getMessage() );
        }
    }
}
Then we compile and run the program:
$ javac MemcachedSet.java
$ java MemcachedSet
2017-03-26 10:31:40.286 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=training.rmzfbh.0001.apse2.cache.amazonaws.com/172.31.28.227:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
Connection to server successful.
set status:true
TestKey value in cache - This is a test message
2017-03-26 10:31:40.330 INFO net.spy.memcached.MemcachedConnection: Shut down memcached client
We are not going to discuss how to work with every Memcached command. Refer to the spymemcached Java API docs if you want to explore this topic further. What we are interested in is how to use Memcached together with a relational database such as MySQL.
Again, we use a configuration file, db.properties, to store the connection details for your ElastiCache Memcached instance and your RDS instance. Here we use the RDS instance with 100 million records that you created in the 105. AWS RDS 01 training as the data source.
mc_hostname=dns-endpoint-of-elasticache-memcached-instance
db_hostname=dns-endpoint-of-rds-instance
db_username=username
db_password=password
db_database=database_name
Below is the example code we use for our demo. The example code takes one command line parameter, which is the firstname being used in the query. For example, if the firstname is "Jason", the test() method makes a query "SELECT * FROM names WHERE firstname = 'Jason'" and prints out the result. The logic of the test() method is to (a) look into the cache to see if there is a cached version; (b) if no cached version is available, query the RDS instance and write the query result into the cache; and (c) print out the query result.
Run this code using your RDS instance and ElastiCache instance and compare the performance difference.
import java.io.*;
import java.net.*;
import java.sql.*;
import java.util.*;
import java.util.concurrent.*;
import java.security.*;
import net.spy.memcached.*;
import javax.sql.rowset.*;

public class MemcachedJDBC
{
    Connection conn;
    MemcachedClient mcc;

    public MemcachedJDBC()
    {
        try
        {
            // Getting database properties from db.properties
            Properties prop = new Properties();
            InputStream input = new FileInputStream("db.properties");
            prop.load(input);
            String mc_hostname = prop.getProperty("mc_hostname");
            String db_hostname = prop.getProperty("db_hostname");
            String db_username = prop.getProperty("db_username");
            String db_password = prop.getProperty("db_password");
            String db_database = prop.getProperty("db_database");
            // Load the MySQL JDBC driver
            Class.forName("com.mysql.jdbc.Driver");
            String jdbc_url = "jdbc:mysql://" + db_hostname + "/" + db_database + "?user=" + db_username + "&password=" + db_password;
            // Connecting to RDS MySQL instance
            conn = DriverManager.getConnection(jdbc_url);
            // Connecting to ElastiCache Memcached instance
            mcc = new MemcachedClient(new InetSocketAddress(mc_hostname, 11211));
        } catch (Exception ex)
        {
            System.out.println( ex.getMessage() );
        }
    }

    // Returns the MD5 digest of the input as a hex string; used as the cache key
    public String MD5encode(String input)
    {
        StringBuffer hexString = new StringBuffer();
        try
        {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] hash = md.digest(input.getBytes());
            for (int i = 0; i < hash.length; i++)
            {
                if ((0xff & hash[i]) < 0x10)
                {
                    hexString.append("0" + Integer.toHexString((0xFF & hash[i])));
                }
                else
                {
                    hexString.append(Integer.toHexString(0xFF & hash[i]));
                }
            }
        } catch (Exception ex)
        {
            System.out.println( ex.getMessage() );
        }
        return hexString.toString();
    }

    public void test(String firstname)
    {
        try
        {
            // Build the SQL query.
            // NOTE: string concatenation is used for brevity in this demo;
            // use a PreparedStatement when the input is not trusted.
            String query = "SELECT * FROM names WHERE firstname = '" + firstname + "'";
            String query_md5 = MD5encode(query);
            System.out.println(query_md5);
            Future<Object> f = mcc.asyncGet(query_md5);
            // Check if the query result is already in cache
            Object result=f.get();
            CachedRowSet crs = RowSetProvider.newFactory().createCachedRowSet();
            if (result == null)
            {
                // Cache miss: query the database and cache the result for 60 seconds
                System.out.println("\n----\nFetching data from database\n----\n");
                ResultSet rs = conn.createStatement().executeQuery(query);
                crs.populate(rs);
                mcc.set(query_md5, 60, crs);
                while (crs.next())
                {
                    System.out.print(crs.getString("firstname") + "\t" + crs.getString("lastname") + "\n");
                }
                rs.close();
            }
            else
            {
                // Cache hit: use the cached row set directly
                System.out.println("\n----\nFetching data from cache\n----\n");
                crs = (CachedRowSet) result;
                crs.beforeFirst();
                while (crs.next())
                {
                    System.out.print(crs.getString("firstname") + "\t" + crs.getString("lastname") + "\n");
                }
            }
        } catch(Exception ex)
        {
            System.out.println( ex.getMessage() );
        }
    }

    public void done()
    {
        try
        {
            conn.close();
            mcc.shutdown();
        } catch(Exception ex)
        {
            System.out.println( ex.getMessage() );
        }
    }

    public static void main(String[] args)
    {
        if (args.length < 1)
        {
            System.out.println("Usage: java MemcachedJDBC <firstname>");
            return;
        }
        // Create an instance of the MemcachedJDBC class
        MemcachedJDBC mj = new MemcachedJDBC();
        // Print out current timestamp
        java.sql.Timestamp t0 = new java.sql.Timestamp(System.currentTimeMillis());
        System.out.println(t0 + "\n");
        // Do a query
        mj.test(args[0]);
        // Print out current timestamp again
        java.sql.Timestamp t1 = new java.sql.Timestamp(System.currentTimeMillis());
        System.out.println("\n" + t1);
        // Shutdown everything
        mj.done();
    }
}
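The lookup logic in test() is often called the cache-aside pattern. To experiment with it without an RDS or ElastiCache instance, the following Java sketch reproduces the same flow with stand-ins: a HashMap plays the role of Memcached and a function plays the role of the database query. All names here are hypothetical, and unlike the code above, the sketch hashes the query as UTF-8 and has no expiration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory model of the cache-aside flow; illustration only.
class CacheAsideSketch
{
    private final Map<String, String> cache = new HashMap<>(); // stand-in for Memcached
    private final Function<String, String> database;           // stand-in for the RDS query
    int databaseCalls = 0;                                     // counts cache misses

    CacheAsideSketch(Function<String, String> database)
    {
        this.database = database;
    }

    // MD5 of the query text as a 32-character hex string, used as the cache key
    static String md5Hex(String input)
    {
        try
        {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] hash = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, hash));
        }
        catch (NoSuchAlgorithmException ex)
        {
            throw new IllegalStateException(ex);
        }
    }

    String lookup(String query)
    {
        String key = md5Hex(query);
        String cached = cache.get(key);
        if (cached != null)
        {
            return cached;                    // (a) cache hit
        }
        databaseCalls++;
        String fresh = database.apply(query); // (b) cache miss: ask the database
        cache.put(key, fresh);                //     and remember the answer
        return fresh;
    }

    public static void main(String[] args)
    {
        CacheAsideSketch demo = new CacheAsideSketch(q -> "rows for: " + q);
        String query = "SELECT * FROM names WHERE firstname = 'Jason'";
        System.out.println(demo.lookup(query));
        System.out.println(demo.lookup(query));
        System.out.println("database calls: " + demo.databaseCalls);
    }
}
```

Hashing the query keeps the key short and free of spaces, which matters because Memcached keys cannot contain whitespace and are limited in length.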
It should be noted that a cached version has an expiration time. In this example, we set the expiration time to 60 seconds. If the records in the database are updated before the cached version expires, your application will still get the stale cached version. As such, it is important to have a cache invalidation method if you always want to show the latest result. This is done by deleting the cached version after inserting new records or updating old records in your database. Assuming that the INSERT and UPDATE queries are also executed by the same Java application, you should try to write such a cache invalidation method as your homework.
If the database is modified from outside your application, your application has no way of knowing about the modifications, and it will keep serving the cached version until it expires.
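The effect of invalidation can be seen in a small in-memory model before you write the real method. In this Java sketch, two HashMaps stand in for the database table and for Memcached, and every name is hypothetical. With a real Memcached client, the invalidation step would be a delete call on the affected cache key instead of a map removal.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of cache invalidation; illustration only.
class InvalidationSketch
{
    private final Map<String, String> database = new HashMap<>(); // stand-in for the RDS table
    private final Map<String, String> cache = new HashMap<>();    // stand-in for Memcached

    // Read path: cache first, then database
    String read(String key)
    {
        String cached = cache.get(key);
        if (cached != null)
        {
            return cached;
        }
        String fresh = database.get(key);
        if (fresh != null)
        {
            cache.put(key, fresh);
        }
        return fresh;
    }

    // Write path with invalidation: update the database, then drop the cached copy
    void write(String key, String value)
    {
        database.put(key, value);
        cache.remove(key);
    }

    // Write path without invalidation, to demonstrate stale reads
    void writeWithoutInvalidation(String key, String value)
    {
        database.put(key, value);
    }

    public static void main(String[] args)
    {
        InvalidationSketch demo = new InvalidationSketch();
        demo.write("Jason", "version 1");
        System.out.println(demo.read("Jason"));
        demo.writeWithoutInvalidation("Jason", "version 2");
        System.out.println(demo.read("Jason") + " (stale)");
        demo.write("Jason", "version 3");
        System.out.println(demo.read("Jason"));
    }
}
```

Note that in the tutorial's code the cache key is derived from the whole query, so a real invalidation method has to work out which cached queries are affected by a given INSERT or UPDATE.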
(4) PHP Session Sharing with Memcached
In our 104. AWS S3 01 training, we created the following PHP code to upload images to your S3 bucket. When we deploy this PHP code to multiple EC2 instances behind an ELB, we notice that the session information on the two EC2 instances is not the same.
<?php
$server = $_SERVER['SERVER_ADDR'];
session_start();
$session_id = session_id();
if (!isset($_SESSION['marker']))
{
$_SESSION['marker'] = $server . ' - ' . time();
}
require 'aws.phar';
$s3 = new Aws\S3\S3Client(['version' => 'latest','region' => 'ap-southeast-2']);
$bucket = '331982-training';
?>
<HTML>
<Head>
<title>Simple S3 Demo</title>
</Head>
<body>
<H1><?php echo $server;?></H1>
<H3><?php echo 'Session ID: ' . $session_id;?></H3>
<H3><?php echo 'Session Marker: ' . $_SESSION['marker'];?></H3>
<HR>
<form action='upload.php' method='post' enctype='multipart/form-data'>
<input type='file' id='fileToUpload' name='fileToUpload'>
<input type='submit' value='Upload' id='submit_button' name='submit_button'>
</form>
<?php
if (isset($_FILES["fileToUpload"]))
{
save_upload_to_s3($s3, $_FILES["fileToUpload"], $bucket);
}
echo "<p>";
$result = $s3->listObjects(array('Bucket' => $bucket));
foreach ($result['Contents'] as $object)
{
echo $object['Key'] . "<br>";
}
function save_upload_to_s3($s3_client, $uploadedFile, $s3_bucket)
{
try
{
// Upload the uploaded file to S3 bucket
$key = $uploadedFile["name"];
$s3_client->putObject(array(
'Bucket' => $s3_bucket,
'Key' => $key,
'SourceFile' => $uploadedFile["tmp_name"],
'ACL' => 'public-read'));
echo "Upload successful<br>";
} catch (Aws\S3\Exception\S3Exception $e)
{
echo "There was an error uploading the file.<br>";
return false;
}
}
?>
<HR>
<footer>
<p align='right'>AWS Tutorials prepared by Qingye Jiang (John).</p>
</footer>
</body>
</HTML>
On all the EC2 instances behind your ELB, edit /etc/php5/apache2/php.ini (Ubuntu 14.04) or /etc/php/7.0/apache2/php.ini (Ubuntu 16.04) and make the following modifications:
session.save_handler = memcached
session.save_path = "[endpoint-to-the-elasticache-instance]:11211"
Then restart Apache on every web server for the new configuration to take effect.
$ sudo service apache2 restart
If you are on Ubuntu 16.04, you will need to install the php-memcached package to make things work. If you do not install this package, you will see an error message in your /var/log/apache2/error.log similar to the following:
[Mon Mar 27 04:01:04.960120 2017] [:error] [pid 6115] [client 172.31.16.102:61367] PHP Warning: session_start(): Cannot find save handler 'memcached' - session startup failed in /var/www/html/upload.php on line 3
On Ubuntu 16.04, install the php-memcached package and restart Apache:
$ sudo apt-get install php-memcached
$ sudo service apache2 restart
Now, refresh your browser and you will see that when your HTTP requests are served by different EC2 instances behind the ELB, the session information is the same. This indicates that you have successfully set up session sharing.
(5) Homework
Modify the above-mentioned upload.php to add the following features:
- When a user uploads a picture, rename it with a UUID and store it on S3. For example, if a user uploads cloud.jpg for the first time, you have something like s3://bucket_name/923ae95a-12a3-11e7-bdbc-022d84b795f3.jpg in your S3 bucket. If a user uploads another cloud.jpg, you have something like s3://bucket_name/acd7ad3e-12a3-11e7-a3fb-022d84b795f3.jpg in your S3 bucket. The reasons for this approach include (a) two files with the same filename might contain different content; (b) we want to better track who uploads what; (c) we don't want people to guess (and iterate over) all the filenames we have.
- For each successful user upload, write a record into a table in your RDS instance, with (at least) the following information: (a) current timestamp, (b) source IP address, (c) browser type, (d) browser version, (e) original filename, (f) filename on S3.
- Create an additional index.php, showing the latest N uploaded images (with records in your RDS instance). Modify your upload.php to redirect to index.php when the uploaded file is successfully stored on S3.
- Use ElastiCache as the cache layer, so that you do not need to perform unnecessary queries to your RDS instance. Make sure that you implement cache invalidation in your upload.php.
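As a hint for the first item, the renaming rule can be expressed in a few lines. The sketch below is in Java to match the earlier examples, although the homework itself is in PHP, and the class and method names are hypothetical. It uses random (version 4) UUIDs, whereas the sample names above look like time-based UUIDs; either kind gives unique, hard-to-guess names.

```java
import java.util.Locale;
import java.util.UUID;

// Hypothetical helper that derives an S3 object key from an uploaded filename.
class UploadKeySketch
{
    static String newObjectKey(String originalFilename)
    {
        // Keep the original extension (if any) so the object is still recognizable as an image
        int dot = originalFilename.lastIndexOf('.');
        String extension = (dot > 0) ? originalFilename.substring(dot).toLowerCase(Locale.ROOT) : "";
        return UUID.randomUUID() + extension;
    }

    public static void main(String[] args)
    {
        // Two uploads of the same filename get two different keys
        System.out.println(newObjectKey("cloud.jpg"));
        System.out.println(newObjectKey("cloud.jpg"));
    }
}
```

The original filename is not lost with this scheme as long as it is written to the RDS record described in the second item.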