Wednesday, October 20, 2010

Clustering and Load Balancing With tc Server and ERS httpd #s2gx

Mark Thomas - SpringSource
  • Tomcat committer
  • tc Server developer
  • responsible for keeping tc Server and Tomcat in sync
    • memory leak detection in tomcat manager app
    • recent logging improvements
    • simplifying jmx access
    • all of the above started in tc Server and have since been contributed back to Tomcat
    • don't want to get into having a significant fork of tomcat
Typical Architectures
  • load balancer (round robin) -> httpd (sticky sessions) -> tc Server (clustered)
    • don't go anywhere near tc Server clustering unless you absolutely have to--adds complexity and overhead
    • only thing tc Server clustering gives you is the ability for users not to lose sessions if an instance of tomcat goes down
    • ask yourself how big of a deal it is if your users lose their sessions when an outage occurs--if it's a big deal then you may need clustering
Starting Point
  • ubuntu 8.04.4 64-bit VM
  • vmware tools installed
  • 64-bit sun jdk 1.6.0_21
  • will be installing tc Server, Hyperic, etc. on this clean image
tc Server Installation
  • don't run tc Server as root
  • create a tcserver user
    • owns the tc Server files
    • runs the tc Server processes
  • install to /usr/local/tcserver
Instance Naming and Port Numbering
  • think about this in advance--may wind up with 100s of instances
  • tc01, tc02, etc. as the instance name, then follow this for ports
  • example scheme for ports (NN is the two-digit instance number, e.g. 01 for tc01)
    • 1NN80 - http
    • 1NN43 - https
    • 1NN09 - ajp
    • 1NN05 - shutdown (if used)
    • 1NN69 - jmx
    • 1NN20 - cluster communication
  • server and jvmRoute naming--consider linking server name to IP address, e.g. srvXXX-tcYY where XXX is the end of the IP address, YY is the tomcat instance number
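The port scheme above can be sketched as a small helper. This is only an illustration of the 1NNxx convention from the notes; the function names are made up for the example and are not part of tc Server.

```python
# Suffixes taken from the port scheme in the notes: 1NN<suffix>.
PORT_SUFFIXES = {
    "http": 80,
    "https": 43,
    "ajp": 9,
    "shutdown": 5,
    "jmx": 69,
    "cluster": 20,
}


def instance_name(instance: int) -> str:
    """Return the instance name for a number, e.g. 1 -> 'tc01'."""
    if not 1 <= instance <= 99:
        raise ValueError("instance number must be between 1 and 99")
    return f"tc{instance:02d}"


def instance_ports(instance: int) -> dict:
    """Map each service to its 1NNxx port for the given instance number."""
    instance_name(instance)  # reuse the range check
    base = 10000 + instance * 100  # the leading '1' followed by NN
    return {service: base + suffix for service, suffix in PORT_SUFFIXES.items()}


print(instance_name(1), instance_ports(1))
```

With two-digit instance numbers the highest port is 19980, which stays well inside the valid TCP port range.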
DEMO: Installing tc Server
  • tc Server version names are e.g. apache-tomcat-6.0.29.A.RELEASE where the first part is the version of Tomcat, the "A" means it's the first release of tc Server based on that tomcat release
  • if shutdown port is disabled, doing a kill -15 does a graceful shutdown. kill -9 works too and tomcat won't care, though your application might, so only do -9 if you have to
  • created two instances of tc Server using the tc Server create instance script
  • tc Server comes with templates for startup scripts--copy these over to /etc/init.d and edit as needed
  • parameterize cluster addresses and ports in the catalina.properties file
  • can use ${...} notation in server.xml to reference the properties defined in that file
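To make the ${...} idea concrete, here is a rough sketch of placeholder substitution. It is not Tomcat's implementation; the property names, the address, and the simplified Receiver fragment are hypothetical and only show the shape of the technique.

```python
import re

# Matches ${name} placeholders.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def substitute(text: str, properties: dict) -> str:
    """Replace each ${name} with its value; this sketch leaves unknown names untouched."""
    def replace(match):
        name = match.group(1)
        return properties.get(name, match.group(0))
    return _PLACEHOLDER.sub(replace, text)


# Hypothetical property names and values, as they might appear in catalina.properties.
properties = {
    "cluster.receiver.address": "192.168.0.11",
    "cluster.receiver.port": "10120",
}

# Simplified, illustrative server.xml fragment.
template = '<Receiver address="${cluster.receiver.address}" port="${cluster.receiver.port}"/>'
print(substitute(template, properties))
```

Keeping the per-instance values in one properties file means the same server.xml can be shared by every instance.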
Creating a Cluster
  • switching to static node membership
    • cumbersome for large clusters
    • remove the <Membership .../> element
    • need to add a bunch of config stuff after the <Interceptor .../> elements
  • easier to use dynamic node discovery
  • backup strategies -- tomcat gives you DeltaManager and BackupManager
    • delta manager is simplest--replicates every session to every node in the cluster
    • if your sessions use a lot of memory, delta manager doesn't give you much scalability
    • if your limitation is CPU, delta manager gives you some scalability
    • amount of network traffic on delta manager increases with the square of the number of nodes--not terribly scalable
  • backup manager
    • replicates session data to one other node in the cluster
    • send options: synchronous vs. asynchronous
      • in synchronous, writes session changes to other nodes, waits for acknowledgement, and then sends response to the user. can mean a lag for the user.
      • asynchronous -- changes to sessions are put on a queue and the user gets the response immediately. means there's a chance that the cluster will be in an inconsistent state. use of sticky sessions means the consistency of the cluster doesn't really matter.
      • because Java thread scheduling isn't deterministic, in asynchronous mode session updates may not be processed in the order in which they were placed on the queue, so an application that depends on that ordering is at risk
    • no need for the WAR farm deployer -- hyperic does this better
      • WAR farm deployer has been removed from tc Server
    • backup manager DOES know where the primary and backup nodes ARE for every session
      • i.e. it doesn't actually store all the sessions from all nodes, but it knows where to get the session it lost
    • backup manager scales much better than delta manager in both memory and network traffic
      • network traffic scales linearly with number of nodes
  • for availability on a small cluster, use the delta manager
  • if you're worried about scalability, go with the backup manager
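The scaling difference between the two managers can be illustrated with a back-of-the-envelope model. It assumes every node handles the same number of session updates and counts one message per replicated update; real traffic depends on session size and configuration, so treat the numbers as relative only.

```python
def delta_manager_messages(nodes: int, updates_per_node: int = 1) -> int:
    """Every node sends each session update to all other nodes."""
    return nodes * (nodes - 1) * updates_per_node


def backup_manager_messages(nodes: int, updates_per_node: int = 1) -> int:
    """Every node sends each session update to a single backup node."""
    if nodes < 2:
        return 0  # nothing to replicate to
    return nodes * updates_per_node


# Compare replication message counts as the cluster grows.
for n in (2, 4, 8, 16):
    print(n, delta_manager_messages(n), backup_manager_messages(n))
```

Under this model the delta manager grows roughly with the square of the node count while the backup manager grows linearly, which matches the advice above: delta for small clusters, backup when scalability matters.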
Hyperic HQ Installation
  • create an hqs user
  • hqs user owns the hyperic hq agent files
  • the agent itself runs as the tcserver user
  • os security considerations
    • agent doesn't need root privileges to access OS mechanics, start/stop processes, etc.
    • tc Server needs to be able to read WAR files uploaded via the agent
    • don't want tc Server runtime running as root
  • hyperic security considerations
    • don't want agent connecting as hqadmin super user
    • create a dedicated agent user
    • requires create, modify, and delete privileges for platform and platform services only
ERS httpd
  • ERS = Enterprise Ready Server
  • SpringSource's distribution of Apache httpd
  • install ERS as root
    • httpd processes run as nobody:nobody so this is fine
  • remove the test instance
  • create a new instance
  • module configuration
    • enable mod_proxy_balancer
    • enable mod_proxy_ajp
    • mod_proxy_ajp isn't quite as stable as mod_jk and mod_proxy_http
    • mentioned something about mod_http now having remote IP addresses available--need to ask about this
  • configure balancer in ers

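<Proxy balancer://tc>
  # member URLs are assumed placeholders (localhost plus the 1NN09 AJP ports); substitute your real hosts and ports
  BalancerMember ajp://localhost:10109 route=tc01-uniqueID
  BalancerMember ajp://localhost:10209 route=tc02-uniqueID
</Proxy>

ProxyPass /cluster-test balancer://tc/cluster-test stickysession=JSESSIONID|jsessionid
ProxyPassReverse /cluster-test balancer://tc/cluster-test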

Debugging Clusters

  • need something in your apps that tells you which cluster node you're on
  • also need something to spit out the session ID so you can test that the sticky sessions are working
  • if your context path differs from your host name in tc Server, this may cause your cookies not to work since the hosts are different
    • can use the ProxyPassReverseCookiePath directive to rewrite the cookie path
    • easier: just have your context path match your host name
  • anything you want replicated in sessions has to be serializable
    • if your application can't support having everything in the session be serializable, terracotta will support non-serializable data in session replication
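The two debugging aids above (which node served the request, and what the session ID is) can be combined into a simple stickiness check. When a jvmRoute is configured, Tomcat appends it to the session ID after a dot, so the route can be read straight from the ID. The session IDs and node names below are made up for illustration.

```python
def route_from_session_id(session_id: str):
    """Return the jvmRoute suffix of a session ID, or None if there is none."""
    _, dot, route = session_id.partition(".")
    return route if dot and route else None


def is_sticky(observations) -> bool:
    """True if every (session_id, serving_node) pair was served by the node named in the ID."""
    return all(route_from_session_id(sid) == node for sid, node in observations)


# Hypothetical observations collected from a test page that prints both values.
observations = [
    ("3F2A9C.tc01-uniqueID", "tc01-uniqueID"),
    ("3F2A9C.tc01-uniqueID", "tc01-uniqueID"),
    ("77B0E1.tc02-uniqueID", "tc02-uniqueID"),
]
print(is_sticky(observations))
```

If a request carrying a tc01 session ID is ever reported as served by tc02 while tc01 is still up, the sticky session configuration in httpd is the first thing to check.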
