1st Mar 2017 by Kurt Garloff

Guide for OTC PreProduction access and usage

Overview

The Open Telekom Cloud (OTC) consists of several environments; the environment that customers normally get to see is the production environment. There is also a staging environment that can be divided into several compartments and is used to test and validate fixes; it is commonly referred to as the PreProduction environment, or PreProd for short.

Normally, OTC PreProd can only be accessed by T-Systems and Huawei engineering to perform testing. Incoming access is guarded by a whitelist. In special cases, access can be granted to partners within a project for a limited amount of time.

This document describes how to access PreProd for partners.

Constraints

The PreProd environment has undergone neither the full testing nor the full review and approval process of the OTC production environment and thus may expose more bugs and potentially weaker security protection than the Production environment. It is also significantly smaller in size and may thus expose resource limitations.

PreProd exposes features that may not be mature yet or may not have been announced to T-Systems' customers yet. You are expected to feed back your observations to T-Systems (and Huawei), but you must not disclose them to third parties or the public. There is no guarantee that the features you see on PreProd will be released on Prod soon or at all.

The operational processes on PreProd differ significantly from Production; T-Systems may apply changes that break user-visible behavior or perform reboots (which affect user VMs) without prior warning.

You also need to be aware that data stored on PreProd may be lost and that you may be asked to free up resources on short notice. We also ask partners to limit usage to what is really needed to perform the project successfully.

When accessing PreProd you need to understand those limitations and agree to them.

Access

The PreProd environment allows outgoing access to the internet from VMs with elastic IP addresses; incoming access to the Web Interface ("Service Console"), to the object storage, to the API endpoints, to PreProd documentation and to the VMs is protected by a whitelist mechanism.

There are two options to access the PreProd environment:

Get your IP whitelisted
Use a proxy we provide

Access via whitelisting

If you use a static IP (e.g. your company's outgoing proxy) to access OTC, you may get this whitelisted. Please let us know the reason and duration and the IP address(es) and we can push this to our security team for approval. Please also indicate the persons that get access this way. Note that we will not whitelist large network segments.

Access via proxy

If you do not work from a static IP, whitelisting does not work for you. We operate a proxy server (in OTC production) that requires authentication (digest auth) and then forwards your request to the PreProd env. This proxy server is whitelisted and under control of TSI Engineering.

To access the OTC PreProd via proxy, you need to configure 160.44.194.54:8128 as the proxy for your client. The client should ask for your credentials for the domain proxyd and you need to supply the username and password that have been given to you. Web browsers can use the proxy script (proxy.pac) for auto-configuration.

The proxy will only allow access from authenticated clients to the address space of OTC (for historical reasons both Production and PreProd). This makes the above-mentioned proxy script handy, because it tells the browser to use the proxy only where needed. The proxy also allows http[s] connections from Prod to PreProd, which can be useful to access the PreProd API gateway or object storage from a VM in OTC production.
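
For command-line HTTP clients, the proxy settings can be passed explicitly. The following is a minimal sketch using curl, which can answer a proxy's digest challenge via --proxy-digest. The function name preprod_curl and the variables PREPROD_PROXY, PROXY_USER and PROXY_PASS are illustrative assumptions and not part of any OTC tooling.

```shell
# Hypothetical wrapper: send a request through the authenticating proxy.
# PROXY_USER / PROXY_PASS (assumed names) hold the credentials you were given.
PREPROD_PROXY="${PREPROD_PROXY:-160.44.194.54:8128}"

preprod_curl() {
    # --proxy-digest makes curl use digest auth towards the proxy,
    # so the password is never sent in plaintext (as basic auth would).
    curl --silent --show-error \
         --proxy "$PREPROD_PROXY" \
         --proxy-digest \
         --proxy-user "${PROXY_USER}:${PROXY_PASS}" \
         "$@"
}
```

For example, preprod_curl -I https://iam.eu-de.otctest.t-systems.com:443/v3 would request only the response headers of the PreProd IAM endpoint.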

Unfortunately, nc and socat support only basic proxy auth, which we do not offer for security reasons (the password would go over the wire in plaintext). Direct ssh access to the VMs is thus not trivial. We can, however, create a shell account on the proxy VM when you submit your public ssh key to us. ssh connections are then possible via something like

ssh -o ProxyCommand="ssh PRUSER@160.44.194.54 nc %h %p" VMUSER@EIP

It is convenient to ssh-add the needed keys to your ssh-agent beforehand. Note that you can also add the ProxyCommand to your ssh_config file.
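
As a sketch of the ssh_config variant, the helper below prints a Host stanza that wraps the same ProxyCommand. The function name and the host alias are hypothetical; PRUSER, VMUSER and EIP stand for the same placeholders as in the command above.

```shell
# Hypothetical helper: print an ssh_config stanza for one PreProd VM.
# Arguments: host alias, proxy shell account, VM user, VM elastic IP.
preprod_ssh_stanza() {
    cat <<EOF
Host $1
    HostName $4
    User $3
    ProxyCommand ssh $2@160.44.194.54 nc %h %p
EOF
}

# Example: preprod_ssh_stanza myvm PRUSER VMUSER EIP >> ~/.ssh/config
```

With such a stanza in place, ssh myvm is enough to reach the VM through the proxy host.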

The shell account can also be used to place API calls, as the proxy VM has conveniently installed the recommended API client tools.

Another option for ssh access to your VMs is to use the (IPsec) VPN tunnel features of OTC.

Configuration differences between Prod and PreProd

The public images used in production come from T-Systems' ImageFactory. They have been pre-configured to work well in OTC production:

Xen PV drivers are installed for best performance.
uvp-monitor is running to collect monitoring data for Cloud Eye.
Update repositories are registered to use a local mirror in the OTC public service zone (which is reachable without outgoing internet access).
NTP is configured to use two NTP sources in the OTC public service zone.
DNS comes via DHCP; the default is set in the subnet configuration, where it defaults to our DNS server in the public service zone.
cloud-init is set up to do the standard jobs such as expanding the root filesystem to the disk size, getting the hostname, injecting a trusted ssh key etc.

A few things do not work the same way on PreProd:

The default DNS server is not reachable. When creating a subnet, please override the defaults to use 8.8.8.8 and 8.8.4.4 (or other reliable public nameservers of your choice). Unlike the OTC nameserver, the public ones are only reachable by machines that have outgoing internet access (via an EIP, SNAT or a NAT instance).
The local mirror servers are not reachable. Go into the repository configuration files (/etc/yum.repos.d/ for RHEL/CentOS and /etc/zypp/repos.d/ for openSUSE/SLES) and enable the commented-out upstream locations instead of the mirror server URLs to re-enable package updates or installations. Unlike the OTC mirror, access to the public mirrors of course needs outgoing internet access.
The pre-configured local NTP servers are not reachable; replace them with de.pool.ntp.org if you need reliable system time. For these (unlike the local servers), you need outgoing internet access. (Note that the VMs get their time from the host system, but over time this can get out of sync, so NTP is more reliable and thus pre-configured.)
Since Feb 2017, there is the option to create SNAT instances; as a lot more services on PreProd require SNAT, it's especially attractive there. See our SNAT guide.
Private image registration in Prod works by uploading standard image files (e.g. .qcow2 or .vhd) to a bucket in your object storage and then specifying this in the private image registration via the web interface or by using the right sequence of glance and otc magic. Please refer to the OTC documentation.
```
s3 put $YOUR_BUCKET/$IMGFILE.$TYPE filename=/$PATHTO/$IMGFILE.$TYPE
glance image-create --name "$IMAGENAME" --disk-format $TYPE --container-format bare --min-disk $MINDISK --min-ram 1024 --protected false --visibility private --property __os_version="$OSVER"
otc.sh images upload $IMGID "$YOUR_BUCKET:$IMGFILE.$TYPE"
```
In this example, replace YOUR_BUCKET with a bucket you own in your Object Store, TYPE with a supported image format (e.g. qcow2, vmdk or vhd), PATHTO with the directory on your local filesystem in which your image file is stored, IMAGENAME with the name under which you want this image to be listed/displayed, MINDISK with the minimal disk size in GB needed for the root file system of the image, OSVER with a supported OS version according to the table in section A.3 of the Image Management API Reference, and IMGID with the image ID returned by the glance command. Note that you can use glance image-show $IMGID to see the progress of the background image creation initiated by the last command of the above sequence, and that you can only start VMs from this image after the status has gone from queued to active. (Unfortunately, glance does not offer a --wait option.)
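
Since there is no --wait option, the polling can be scripted. The following is a minimal sketch, not part of otc-tools: the function name wait_for_image and its timeout and interval arguments are hypothetical, and it assumes that glance image-show prints its default table output with a "| status | ... |" row.

```shell
# Hypothetical helper: poll "glance image-show" until the image is active.
# Usage: wait_for_image IMGID [TIMEOUT_SECONDS] [POLL_INTERVAL_SECONDS]
# Returns 0 when active, 1 when the image failed, 2 on timeout.
wait_for_image() {
    wfi_id=$1
    wfi_timeout=${2:-1800}
    wfi_interval=${3:-10}
    wfi_elapsed=0
    while :; do
        # Pick the value column of the "| status | ... |" table row.
        wfi_status=$(glance image-show "$wfi_id" |
            awk -F'|' '$2 ~ /^ *status *$/ { gsub(/ /, "", $3); print $3 }')
        case "$wfi_status" in
            active) return 0 ;;
            killed|deleted) echo "image $wfi_id: $wfi_status" >&2; return 1 ;;
        esac
        if [ "$wfi_elapsed" -ge "$wfi_timeout" ]; then
            return 2
        fi
        sleep "$wfi_interval"
        wfi_elapsed=$((wfi_elapsed + wfi_interval))
    done
}
```

A call such as wait_for_image "$IMGID" && echo ready can then be placed between the upload and the first VM creation.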

This mechanism works in PreProd as well (use otc-tools-0.6.20 or later). However, the OpenStack API compatibility has been improved, so that an explicit transfer via object storage buckets is no longer needed. You can use glance image-create followed by glance image-upload to directly upload .zvhd files as private images. Since Dec 2016, this also works for standard community image formats (such as .qcow2), which then get converted to (z)vhd in the background by OTC.
The public IP addresses in PreProd are from the 80.158.0.0/23 space, whereas the main pool for Prod currently is 160.44.192.0/20. This will expand over time ...
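
The NTP change from the list above can be scripted when several VMs need it. This is a sketch under assumptions: it presumes the image runs ntpd with a classic /etc/ntp.conf (images using chrony or systemd-timesyncd need a different file), the function name is hypothetical, and it removes all existing server lines rather than only the OTC ones.

```shell
# Hypothetical helper: point an ntpd-style config file at the public pool.
# Usage (as root on the VM): use_public_ntp /etc/ntp.conf
use_public_ntp() {
    upn_conf=$1
    upn_tmp=$(mktemp) || return 1
    # Drop the existing "server" lines (the unreachable local servers) ...
    sed '/^[[:space:]]*server[[:space:]]/d' "$upn_conf" > "$upn_tmp" || return 1
    # ... and append the public pool named in the text.
    echo "server de.pool.ntp.org iburst" >> "$upn_tmp"
    # Write back in place so ownership and permissions of the file are kept.
    cat "$upn_tmp" > "$upn_conf" && rm -f "$upn_tmp"
}
```

The NTP service has to be restarted afterwards for the change to take effect.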

API usage

For a general overview of API usage, please see the Telekom Blog article and the DOST presentation. They contain a basic introduction on API usage on OTC, including a recommended working set of python client tools. To get a working set of client tools, you can use a docker container via docker pull tsiotc/otc-client, or simply use the openSUSE 42.1 or SLES12 public images, which have the correct tools. The images can also be downloaded from the ImageFactory (as .qcow2) and be used as VMs on a local KVM or VirtualBox.

There is one difference between the config needed for the OTC Prod environment and the one needed for PreProd: replace the domain otc.t-systems.com with otctest.t-systems.com in all URLs.

So a working ~/.ostackrc (or openrc or novarc) looks like this on PreProd:

# Source this file to set the OTC Config
export OS_USERNAME="USERNUMBER OTC000000000010000XXXXXX"
#export OS_USER_DOMAIN_NAME="${OS_USERNAME##* }"
export OS_USER_DOMAIN_NAME="OTC000000000010000XXXXXX"
export OS_PASSWORD="THISisAverySECRETstring"
# Only change these for a different region
export OS_TENANT_NAME=eu-de 
export OS_PROJECT_NAME=eu-de
export OS_AUTH_URL=https://iam.eu-de.otctest.t-systems.com:443/v3
#export HTTPS_PROXY="https://160.44.194.54:8128"
# No changes needed beyond this point
export NOVA_ENDPOINT_TYPE=publicURL
export OS_ENDPOINT_TYPE=publicURL
export CINDER_ENDPOINT_TYPE=publicURL
export OS_VOLUME_API_VERSION=2
export OS_IDENTITY_API_VERSION=3
export OS_IMAGE_API_VERSION=2
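
If you already have an rc file for Prod, the domain replacement can be applied mechanically. A minimal sketch; the function name to_preprod and the output file name in the example are hypothetical.

```shell
# Hypothetical helper: rewrite Prod endpoints to PreProd, stdin -> stdout.
to_preprod() {
    # The dots are escaped so that only the literal Prod domain matches;
    # URLs that already point to otctest are left unchanged.
    sed 's/otc\.t-systems\.com/otctest.t-systems.com/g'
}

# Example: to_preprod < ~/.ostackrc > ~/.ostackrc.preprod
```

Remember that the credentials differ between the environments as well, so the user name and password lines still need to be adjusted by hand.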

The OS_PASSWORD needs to be retrieved as "API key" from the "My Credentials" IAM page in the Web Interface as described in the referenced articles. (Update: Since Jan 2017, the API key is the password that can be set to access the OTC directly without using MyWorkPlace. You can also use short OS_USERNAME by configuring local IAM users in OTC, so the OS_USER_DOMAIN_NAME then is no longer contained in the user name.)

Note that the API gateway is also protected via the whitelist.

This means that API calls are only possible from VMs in OTC PreProd and from whitelisted machines. The proxy is whitelisted and it has a working collection of client tools, so you can place calls there (if you have a shell account). You can also talk to the PreProd API gateway from Prod VMs through the proxy, as it does not require authentication for that IP range. (Set the HTTPS_PROXY environment variable for this -- it is commented out in the above code snippet.)
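
When only some calls from a Prod VM should go through the proxy, HTTPS_PROXY can be set per command instead of globally. A small sketch; the function name via_preprod_proxy is hypothetical, and the proxy URL is the one from the commented-out line in the rc file above.

```shell
# Hypothetical wrapper: run one command with HTTPS_PROXY set in its
# environment only, leaving the calling shell untouched.
via_preprod_proxy() {
    env HTTPS_PROXY="https://160.44.194.54:8128" "$@"
}

# Example (on a Prod VM, after sourcing the PreProd rc file):
#   via_preprod_proxy nova list
```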

The PreProd environment has its own Object Storage (S3) server that can be configured for tools such as s3 (from libs3) with

export S3_HOSTNAME=obs.eu-de.otctest.t-systems.com
export S3_ACCESS_KEY_ID=AKI
export S3_SECRET_ACCESS_KEY=SAK

You can extract AKI and SAK from the credentials.csv file that can be downloaded when you generate such access keys in the IAM from the Web Interface. When accessing the PreProd object store from machines outside of PreProd that are not whitelisted, you can use the proxy if your tooling supports working with digest-authenticating proxies.
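
The extraction can be scripted. The sketch below is an assumption about the file layout, not a documented format: it presumes that credentials.csv has a header line followed by one data line of the form user,access-key,secret-key. Check your downloaded file before relying on it. The function name is hypothetical.

```shell
# Hypothetical helper: print export lines for the s3 tool from credentials.csv.
# ASSUMPTION: line 1 is a header, line 2 holds "user,access-key,secret-key".
s3_env_from_csv() {
    awk -F',' 'NR == 2 {
        gsub(/["\r]/, "")    # strip quotes and a DOS line ending, if any
        printf "export S3_ACCESS_KEY_ID=\047%s\047\n", $2
        printf "export S3_SECRET_ACCESS_KEY=\047%s\047\n", $3
    }' "$1"
    echo "export S3_HOSTNAME=obs.eu-de.otctest.t-systems.com"
}

# Example: eval "$(s3_env_from_csv credentials.csv)"
```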

Note that the Web Interface to Object Storage is limited to small uploads (this is the same in the Production environment). You can use tools like s3cmd or s3 on Linux or S3Browser on Windows. Note that the OBSbrowser from Huawei (which is recommended in most Huawei docs) does not support digest auth to the proxy and can thus not be used on a remote machine that needs to talk to PreProd via the proxy.

Hints

A collection of useful hints (most of which apply to both Prod and PreProd).

The default security group allows all outgoing traffic, but incoming traffic only from VMs of the same group. Please open TCP port 22 for the world (and ICMP echo if you like) to make use of ssh access with an injected key. Remember that the standard user is linux on ImageFactory images and ubuntu on the Ubuntu images.
The login via the noVNC remote console is really meant as an emergency access mechanism only. You can use the standard login with user linux and password cloud.1234 on ImageFactory linux images; this user also has full sudo power. ssh login with password authentication is of course not possible. You can and should change the password for the default user. We strongly recommend not enabling password login via ssh at all on VMs exposed to the internet, as there will be brute force attacks -- but if you really need to do it, please ensure you have changed the standard password, as you will be trivially hackable otherwise. Since Dec 2016, the password is randomized; it is displayed on the noVNC console as a login hint, though, so you can still use this as an emergency login method.
Outgoing internet access is only available for VMs with an EIP. We are working on an SNAT service that would allow VMs without EIP to have outgoing connections. Please do not work around this by assigning an EIP to every VM -- public IPv4 addresses are a scarce resource, probably the only resource on OTC whose price will rise over time. Instead, you can set up an http/https proxy (squid) and possibly ntp and DNS (dnsmasq) servers on a pair of JumpHosts with EIPs.
If you set up networks using neutron, you have more flexibility than the Web Interface currently offers. You can connect several disjoint CIDR ranges via a router (unlike the Web Interface's VPC) and you can have L2 networks spanning both AZs. (The latter is not normally recommended due to performance considerations.) The downside is that security groups and even VMs that are attached to such networks are currently not showing up properly in the Web Interface.
When booting a VM via the Web Interface or via the otc-tools commands, you can easily choose a root disk that is larger than the minimum size we have registered the images with (4GB for most of the ImageFactory images). If you want to achieve the same using nova boot (or openstack server create), you need to create a bootable root volume with cinder first -- the one-step VM creation with nova specifying the image does not offer you this level of flexibility. Even though the root filesystems are on persistent storage on OTC (if you reboot a VM, you will not get a fresh root filesystem on OTC), unlike on AWS, it is still best practice on clouds to treat root filesystems as ephemeral -- limit your persistence layer to object storage, databases or specific volumes (block storage) that are separate from your root disks. The root disk should be stateless -- this allows you to scale and to recover from failing VMs, e.g. due to cold migration.
When hosts (hypervisors in OpenStack terminology) need maintenance, the running customer VMs are normally live-migrated. This means that customer applications keep running even if they are not cloud-ready and prepared to deal with cold migration. However, performance may be impacted for several minutes and there will be an interruption of several seconds. We require applications to deal with potential timeouts happening due to the interruption and reconnect to recover. It's difficult to achieve a 100% success rate with live migrations; there is a list of known bugs in guest OS kernels or drivers that can lead to occasional failures. While we have addressed the known issues in our public images, there are few guarantees for private images - we are currently preparing a document to list the known pitfalls and conditions. Furthermore, special instance types that have local disks (disk intensive flavor di.*) or SRIOV network acceleration (disk intensive and SAP HANA flavors) do not support live migration.

When hosts need to be rebooted (on Prod), we will thus inform customers in advance even though we perform live migrations for the VMs.
Keep your access data secure. Make sure you don't lose your SSHkey.pem, .ostackrc or credentials.csv files and ensure they are not disclosed to anyone (chmod 0600 is a good idea). To recover from mistakes here, you can reset your password or create a new AK/SK pair, which will invalidate the old one, but your virtual environment might already be compromised if your key has gotten into the wrong hands.

It is also recommended to append a suffix (SSHkey-TENANT.pem, .ostackrc.TENANT, ...) if you handle multiple tenants. otc-tools facilitates this naming by reading the environment from ~/.ostackrc.$OTC_TENANT first if it exists.
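
For your own scripts, the same lookup order can be sketched as follows. This is an illustration rather than the otc-tools implementation; the function name pick_ostackrc is hypothetical. It also tightens the file permissions as recommended above.

```shell
# Hypothetical helper: prefer ~/.ostackrc.$OTC_TENANT, fall back to
# ~/.ostackrc, and print the path of the file that was chosen.
pick_ostackrc() {
    if [ -n "${OTC_TENANT:-}" ] && [ -f "$HOME/.ostackrc.$OTC_TENANT" ]; then
        por_file="$HOME/.ostackrc.$OTC_TENANT"
    else
        por_file="$HOME/.ostackrc"
    fi
    [ -f "$por_file" ] || return 1
    chmod 0600 "$por_file"    # keep the credentials private to your user
    printf '%s\n' "$por_file"
}

# Example: . "$(pick_ostackrc)"
```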